  1. Oozie
  2. OOZIE-3715

Fix: when a fork submits more than one transition and one submission fails, KillXCommand is not executed


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 5.2.1
    • Fix Version/s: 5.3.0
    • Component/s: core
    • Patch

    Description

      When a workflow forks into two transitions (A and B) that are submitted together and the submission of A fails, B keeps running, because the KillXCommand is never executed.

      In SignalXCommand.startForkedActions, when the submission of one transition fails, a new ActionStartXCommand is created and failJob is invoked on it. failJob adds a WorkflowNotificationXCommand and a KillXCommand to the commandQueue, which is only dispatched from the XCommand.call method. Because the two commands are added to the commandQueue of the new ActionStartXCommand rather than to that of the SignalXCommand, and call is never invoked on that ActionStartXCommand, the KillXCommand is never executed.

      The code is as follows:

       

          public void startForkedActions(List<WorkflowActionBean> workflowActionBeanListForForked) throws CommandException {
              ......
              for (Future<ActionExecutorContext> result : futures) {
                  ......
                  if (context.getJobStatus() != null && context.getJobStatus().equals(Job.Status.FAILED)) {
                      // failJob queues its follow-up commands on this temporary ActionStartXCommand
                      new ActionStartXCommand(context.getAction().getId(), null).failJob(context);
                      ......
                  }
              }
              ......
          }
      

       

          public void failJob(ActionExecutor.Context context, WorkflowActionBean action) throws CommandException {
              WorkflowJobBean workflow = (WorkflowJobBean) context.getWorkflow();
              if (!handleUserRetry(context, action)) {
                  incrActionErrorCounter(action.getType(), "failed", 1);
                  LOG.warn("Failing Job due to failed action [{0}]", action.getName());
                  try {
                      workflow.getWorkflowInstance().fail(action.getName());
                      WorkflowInstance wfInstance = workflow.getWorkflowInstance();
                      ((LiteWorkflowInstance) wfInstance).setStatus(WorkflowInstance.Status.FAILED);
                      workflow.setWorkflowInstance(wfInstance);
                      workflow.setStatus(WorkflowJob.Status.FAILED);
                      action.setStatus(WorkflowAction.Status.FAILED);
                      action.resetPending();
                      // Both commands go into the commandQueue of the command that failJob is called on
                      queue(new WorkflowNotificationXCommand(workflow, action));
                      queue(new KillXCommand(workflow.getId()));
                      InstrumentUtils.incrJobCounter(INSTR_FAILED_JOBS_COUNTER_NAME, 1, getInstrumentation());
                  }
                  catch (WorkflowException ex) {
                      throw new CommandException(ex);
                  }
              }
          }
      
      

       

          public final T call() throws CommandException {
              ......
              // Only the commandQueue of the command whose call() is running gets dispatched here
              if (commandQueue != null) {
                  for (Map.Entry<Long, List<XCommand<?>>> entry : commandQueue.entrySet()) {
                      LOG.debug("Queuing [{0}] commands with delay [{1}]ms", entry.getValue().size(), entry.getKey());
                      if (!callableQueueService.queueSerial(entry.getValue(), entry.getKey())) {
                          LOG.warn("Could not queue [{0}] commands with delay [{1}]ms, queue full", entry.getValue()
                              .size(), entry.getKey());
                      }
                  }
              }
              ......
          }
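
The interaction between the three excerpts can be reproduced in isolation. The following is a minimal sketch, not Oozie code: `Command`, `Signal`, `run` and the `adoptHelperQueue` flag are hypothetical stand-ins for XCommand, SignalXCommand and their queueing behaviour, and the "hand-over" branch is only one conceivable remedy, not necessarily what the attached patches do.

```java
import java.util.ArrayList;
import java.util.List;

public class ForkQueueSketch {

    /** Hypothetical stand-in for XCommand: follow-ups are dispatched only from call(). */
    static class Command {
        final String name;
        final List<Command> commandQueue = new ArrayList<>();

        Command(String name) {
            this.name = name;
        }

        void queue(Command followUp) {
            commandQueue.add(followUp);
        }

        /** Stand-in for the command body; does nothing by default. */
        void execute() {
        }

        /** Runs the body, then hands this command's own queue to the dispatcher. */
        final void call(List<String> dispatched) {
            execute();
            for (Command c : commandQueue) {
                dispatched.add(c.name);
            }
        }

        /** Stand-in for failJob: queues the notification and the kill on this instance. */
        void failJob() {
            queue(new Command("WorkflowNotificationXCommand"));
            queue(new Command("KillXCommand"));
        }
    }

    /** Stand-in for SignalXCommand handling a failed forked submission. */
    static class Signal extends Command {
        final boolean adoptHelperQueue; // assumption: models one possible fix

        Signal(boolean adoptHelperQueue) {
            super("SignalXCommand");
            this.adoptHelperQueue = adoptHelperQueue;
        }

        @Override
        void execute() {
            Command helper = new Command("ActionStartXCommand");
            helper.failJob(); // follow-ups land in helper.commandQueue
            if (adoptHelperQueue) {
                // Hand the helper's follow-ups to the command that is actually running
                commandQueue.addAll(helper.commandQueue);
            }
            // helper.call() is never invoked, so helper.commandQueue is never dispatched
        }
    }

    /** Runs the signal stand-in and returns the names of the dispatched follow-ups. */
    static List<String> run(boolean adoptHelperQueue) {
        List<String> dispatched = new ArrayList<>();
        new Signal(adoptHelperQueue).call(dispatched);
        return dispatched;
    }

    public static void main(String[] args) {
        System.out.println("as reported:    " + run(false));
        System.out.println("with hand-over: " + run(true));
    }
}
```

In this model, `run(false)` dispatches nothing, which mirrors the reported behaviour where B keeps running; `run(true)` dispatches both follow-up commands because they end up in the queue of the command whose `call` is executing.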
      

       

       

       

      Attachments

        1. forkSubmitFail_issue.txt
          2 kB
          chenhaodan
        2. OOZIE-3715-001.patch
          1 kB
          chenhaodan
        3. OOZIE-3715-002.patch
          8 kB
          chenhaodan
        4. OOZIE-3715-003.patch
          9 kB
          chenhaodan
        5. OOZIE-3715-004.patch
          9 kB
          chenhaodan
        6. OOZIE-3715-005.patch
          10 kB
          chenhaodan
        7. OOZIE-3715-006.patch
          11 kB
          chenhaodan
        8. status.png
          57 kB
          chenhaodan

          People

            Assignee: chenhd chenhaodan
            Reporter: chenhd chenhaodan
            Votes: 0
            Watchers: 5
