Beam / BEAM-11266

Cannot use Python MongoDB connector with Atlas MongoDB

Details

    • Type: Bug
    • Status: Triage Needed
    • Priority: P2
    • Resolution: Fixed
    • Affects Version/s: 2.25.0
    • Fix Version/s: 2.27.0
    • Component/s: io-py-mongodb
    • Google Cloud Dataflow

    Description

      The Python MongoDB connector cannot be used with a managed Atlas instance. The current implementation uses splitVector, a high-privilege command that cannot be granted to any user in Atlas. The read fails with:

      pymongo.errors.OperationFailure: not authorized on properties to execute command
       { splitVector: "properties.properties", keyPattern: { _id: 1 },
      ...

      BEAM-4567 addressed the same issue in the Java connector.

      The proposed solution for Python is to add a bucket_auto option to the connector, which configures it to use the $bucketAuto MongoDB aggregation stage instead of the splitVector command:

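      from apache_beam.io.mongodbio import ReadFromMongoDB

      # bucket_auto is the option proposed in this issue.
      pipeline | ReadFromMongoDB(uri='mongodb+srv://user:pwd@cluster0.mongodb.net',
                                 db='testdb',
                                 coll='input',
                                 bucket_auto=True)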
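
      To illustrate the idea behind the proposal, here is a minimal sketch of how $bucketAuto output could be turned into per-split range filters on _id. The helper names bucket_auto_pipeline and buckets_to_filters are hypothetical and are not taken from the Beam connector; the actual implementation may differ. The sketch relies only on the documented shape of $bucketAuto results, where each bucket has an _id with min (inclusive) and max (exclusive, except for the last bucket).

```python
from typing import Any, Dict, List


def bucket_auto_pipeline(num_buckets: int) -> List[Dict[str, Any]]:
    """Build an aggregation pipeline that groups documents by _id
    into at most `num_buckets` buckets of roughly equal size."""
    return [{"$bucketAuto": {"groupBy": "$_id", "buckets": num_buckets}}]


def buckets_to_filters(buckets: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Convert $bucketAuto result documents into range filters on _id.

    Each result document looks like:
        {"_id": {"min": <lower>, "max": <upper>}, "count": <n>}
    """
    filters = []
    for index, bucket in enumerate(buckets):
        lower = bucket["_id"]["min"]
        upper = bucket["_id"]["max"]
        # The upper bound is exclusive for every bucket except the last,
        # whose upper bound is inclusive.
        is_last = index == len(buckets) - 1
        upper_op = "$lte" if is_last else "$lt"
        filters.append({"_id": {"$gte": lower, upper_op: upper}})
    return filters


# Hypothetical usage with pymongo (requires a reachable MongoDB instance):
#   from pymongo import MongoClient
#   coll = MongoClient(uri)["testdb"]["input"]
#   buckets = list(coll.aggregate(bucket_auto_pipeline(4)))
#   for query in buckets_to_filters(buckets):
#       docs = coll.find(query)
```

      Unlike splitVector, $bucketAuto is an ordinary aggregation stage, so it needs no privileges beyond running an aggregation on the collection.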
      

            People

              Assignee: Yichi Zhang (yichi)
              Reporter: Eugene Nikolaiev (EugeneNikolaiev)
              Votes: 0
              Watchers: 0

                Time Tracking

                  Original Estimate: Not Specified
                  Remaining Estimate: 0h
                  Time Spent: 2h 50m