  1. Oozie
  2. OOZIE-2558

Specify dataset done flag in regex format

Details

    • Type: Improvement
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 4.2.0
    • Fix Version/s: None
    • Component/s: coordinator
    • Labels: None

    Description

      Preconditions
      There is a business use case in which a custom empty file (similar to _SUCCESS) is used to indicate that a folder is complete, while for another folder the _SUCCESS file itself is used as the completion indicator. Both folders belong to the same dataset, and the dataset's data availability feature needs to work for both of them.

      New Feature
      Please make it possible to specify the dataset done flag as a regular expression.

      Example:

      <datasets>
          <dataset name="xxx-in-dataset" frequency="5" initial-instance="${xxx_data_initial_instance}" timezone="${xxx_data_timezone}">
              <uri-template>${xxx_in_dir}/${YEAR}/${MONTH}/${DAY}/${HOUR}/${MINUTE}</uri-template>
              <done-flag format="regex">_SUCCESS|_XXX</done-flag>
          </dataset>
      </datasets>
      

      The regex _SUCCESS|_XXX tells Oozie to treat the dataset instance as available when either a _SUCCESS or an _XXX file is present.
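      To illustrate the requested semantics, here is a minimal sketch in Python. It is not Oozie code: the function name is_dataset_instance_done is hypothetical, and the whole-name matching rule (the regex must match the entire file name, not just part of it) is an assumption about how the feature might behave.

```python
import re
from typing import Iterable


def is_dataset_instance_done(file_names: Iterable[str], done_flag_regex: str) -> bool:
    """Hypothetical helper: return True if any file name in the dataset
    instance directory matches the done-flag regex.

    Assumption: the regex must match the whole file name, so that
    "_SUCCESS_old" does not count as a done flag for "_SUCCESS|_XXX".
    """
    pattern = re.compile(done_flag_regex)
    return any(pattern.fullmatch(name) is not None for name in file_names)


# One folder of the dataset is marked with _SUCCESS, another with _XXX.
folder_a = ["part-00000", "_SUCCESS"]
folder_b = ["part-00000", "_XXX"]
folder_c = ["part-00000"]  # still being written, no flag yet

done_flag = "_SUCCESS|_XXX"
print(is_dataset_instance_done(folder_a, done_flag))
print(is_dataset_instance_done(folder_b, done_flag))
print(is_dataset_instance_done(folder_c, done_flag))
```

      With whole-name matching, a plain done flag such as _SUCCESS behaves the same whether it is read as a literal or as a regex, which would make it easier to keep the existing done-flag behavior when no format attribute is given.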

          People

            Assignee: Unassigned
            Reporter: Nazar Volynets (nvolynets)
            Votes: 0
            Watchers: 1
