Beam / BEAM-7998

MatchFiles or MatchAll seems to return the same element several times

Details

    • Type: Bug
    • Status: Open
    • Priority: P3
    • Resolution: Unresolved
    • Affects Version/s: 2.14.0
    • Fix Version/s: None
    • Component/s: io-py-files
    • Environment: GCP for storage, DirectRunner and DataflowRunner both have the problem. PyCharm on Win10 for IDE and dev environment.
    • Flags: Important

    Description

      Hi team,

      When I use MatchFiles with a wildcard pattern on files located in a GCP bucket, the MatchFiles transform returns the same file several times (at least twice).

      I tried to follow the call stack, and I can see that MatchAll is called twice when I run the pipeline on a debug project where the bucket contains a single element.

      I do not know the code well enough to say more than that, sorry.
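
      A minimal sketch of how the duplication could be checked, not taken from the report itself: it runs MatchFiles on a pattern, counts how often each path comes back, and flags any path returned more than once. The helper names describe and print_match_counts are hypothetical, and reading from gs:// paths assumes Beam was installed with the GCP extras (apache-beam[gcp]).

```python
def describe(path_count):
    """Format a (path, count) pair, flagging paths matched more than once."""
    path, count = path_count
    flag = "DUPLICATE" if count > 1 else "ok"
    return "{}: {} matched {} time(s)".format(flag, path, count)


def print_match_counts(file_pattern):
    """Run MatchFiles on file_pattern and print how often each path is returned."""
    # Imported lazily so describe() stays usable without Beam installed.
    import apache_beam as beam
    from apache_beam.io import fileio

    # No pipeline options are passed, so Beam falls back to its default
    # runner (the DirectRunner).
    with beam.Pipeline() as pipeline:
        (
            pipeline
            | "Match" >> fileio.MatchFiles(file_pattern)
            | "ToPath" >> beam.Map(lambda metadata: metadata.path)
            | "CountPerPath" >> beam.combiners.Count.PerElement()
            | "Describe" >> beam.Map(describe)
            | "Print" >> beam.Map(print)
        )
```

      Calling print_match_counts with a wildcard pattern that points at the debug bucket (the bucket name is left to the reader) should print one line per distinct path. With a single file in the bucket, any line starting with DUPLICATE would confirm the behaviour described above.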

      Best regards

      Jerome

          People

            Assignee: Unassigned
            Reporter: Jerome MASSOT (jerome.massot.78@gmail.com)
            Votes: 0
            Watchers: 5