  1. Beam
  2. BEAM-6149

Resolving gcp extra dependencies breaks ssl support when contacting google

Details

    • Patch

    Description

      It may sound like an oxymoron, but when apache-beam is fully resolved with its gcp extra dependencies, httplib2 is forced onto a version that cannot call Google, and any pipeline that uses Google services fails (I haven't checked others).

      I have traced the problem all the way back; let me explain my findings.

      A quick way to reproduce this is to install all the dependencies with pipenv, which makes sure that sub-dependencies are resolved. Run pipenv install apache-beam[gcp], and then run python -c 'from google.cloud import bigquery;client=bigquery.Client(); list(client.list_projects())'. The error is the same when running a pipeline, but I kept the example simple.

      It will throw an error like this one:

      /home/javier/.local/share/virtualenvs/bqssltest-obub2LuN/lib/python2.7/site-packages/google/auth/_default.py:66: UserWarning: Your application has authenticated using end user credentials from Google Cloud SDK. We recommend that most server applications use service accounts instead. If your application continues to use end user credentials from Cloud SDK, you might receive a "quota exceeded" or "API not enabled" error. For more information about service accounts, see https://cloud.google.com/docs/authentication/
        warnings.warn(_CLOUD_SDK_CREDENTIALS_WARNING)
      Traceback (most recent call last):
        File "<string>", line 1, in <module>
        File "/home/javier/.local/share/virtualenvs/bqssltest-obub2LuN/lib/python2.7/site-packages/google/cloud/iterator.py", line 218, in _items_iter
          for page in self._page_iter(increment=False):
        File "/home/javier/.local/share/virtualenvs/bqssltest-obub2LuN/lib/python2.7/site-packages/google/cloud/iterator.py", line 247, in _page_iter
          page = self._next_page()
        File "/home/javier/.local/share/virtualenvs/bqssltest-obub2LuN/lib/python2.7/site-packages/google/cloud/iterator.py", line 347, in _next_page
          response = self._get_next_page_response()
        File "/home/javier/.local/share/virtualenvs/bqssltest-obub2LuN/lib/python2.7/site-packages/google/cloud/iterator.py", line 396, in _get_next_page_response
          query_params=params)
        File "/home/javier/.local/share/virtualenvs/bqssltest-obub2LuN/lib/python2.7/site-packages/google/cloud/_http.py", line 299, in api_request
          headers=headers, target_object=_target_object)
        File "/home/javier/.local/share/virtualenvs/bqssltest-obub2LuN/lib/python2.7/site-packages/google/cloud/_http.py", line 193, in _make_request
          return self._do_request(method, url, headers, data, target_object)
        File "/home/javier/.local/share/virtualenvs/bqssltest-obub2LuN/lib/python2.7/site-packages/google/cloud/_http.py", line 223, in _do_request
          body=data)
        File "/home/javier/.local/share/virtualenvs/bqssltest-obub2LuN/lib/python2.7/site-packages/google_auth_httplib2.py", line 187, in request
          self._request, method, uri, request_headers)
        File "/home/javier/.local/share/virtualenvs/bqssltest-obub2LuN/lib/python2.7/site-packages/google/auth/credentials.py", line 122, in before_request
          self.refresh(request)
        File "/home/javier/.local/share/virtualenvs/bqssltest-obub2LuN/lib/python2.7/site-packages/google/oauth2/credentials.py", line 136, in refresh
          self._client_secret))
        File "/home/javier/.local/share/virtualenvs/bqssltest-obub2LuN/lib/python2.7/site-packages/google/oauth2/_client.py", line 237, in refresh_grant
          response_data = _token_endpoint_request(request, token_uri, body)
        File "/home/javier/.local/share/virtualenvs/bqssltest-obub2LuN/lib/python2.7/site-packages/google/oauth2/_client.py", line 106, in _token_endpoint_request
          method='POST', url=token_uri, headers=headers, body=body)
        File "/home/javier/.local/share/virtualenvs/bqssltest-obub2LuN/lib/python2.7/site-packages/google_auth_httplib2.py", line 119, in __call__
          raise exceptions.TransportError(exc)
      google.auth.exceptions.TransportError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:726)
      

      I think this problem hasn't been reported before because people ignore pip's output, which clearly states that there are dependency issues:

      (bqssltest) javier@ffukn897:~/projects/spinoffs/bqssltest$ pip install 'apache-beam[gcp]==2.7.0'
      ...
      google-gax 0.15.16 has requirement future<0.17dev,>=0.16.0, but you'll have future 0.17.1 which is incompatible.
      gapic-google-cloud-pubsub-v1 0.15.4 has requirement oauth2client<4.0dev,>=2.0.0, but you'll have oauth2client 4.1.3 which is incompatible.
      googledatastore 7.0.1 has requirement httplib2<0.10,>=0.9.1, but you'll have httplib2 0.11.3 which is incompatible.
      googledatastore 7.0.1 has requirement oauth2client<4.0.0,>=2.0.1, but you'll have oauth2client 4.1.3 which is incompatible.
      proto-google-cloud-pubsub-v1 0.15.4 has requirement oauth2client<4.0dev,>=2.0.0, but you'll have oauth2client 4.1.3 which is incompatible.
      proto-google-cloud-datastore-v1 0.90.4 has requirement oauth2client<4.0dev,>=2.0.0, but you'll have oauth2client 4.1.3 which is incompatible.
      ...
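
To confirm which versions actually ended up in an environment, something like the following can help. It is only a sketch: the helper name installed_version is made up for illustration, and the package list is just the set of distributions mentioned in this report. It assumes importlib.metadata is available (Python 3.8+) and falls back to pkg_resources from setuptools on older interpreters such as 2.7.

```python
try:
    from importlib import metadata as importlib_metadata  # Python 3.8+
except ImportError:
    importlib_metadata = None
    import pkg_resources  # setuptools fallback for older interpreters


def installed_version(name):
    """Return the installed version of a distribution, or None if it is absent.

    Hypothetical helper, not part of apache-beam or pip.
    """
    if importlib_metadata is not None:
        try:
            return importlib_metadata.version(name)
        except importlib_metadata.PackageNotFoundError:
            return None
    try:
        return pkg_resources.get_distribution(name).version
    except pkg_resources.DistributionNotFound:
        return None


# Distributions named in this report; adjust the list as needed.
for package in ("apache-beam", "googledatastore", "httplib2", "oauth2client"):
    print("%s: %s" % (package, installed_version(package)))
```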
      

      These warnings are caused by the version pinning in the GCP requirements. Specifically, googledatastore==7.0.1 has a direct requirement on httplib2 [required: >=0.9.1,<0.10, installed: 0.9.2], so a full resolution ends up with httplib2 0.9.2. apache-beam also pins httplib2 directly, but that pin doesn't cause the problem because it asks for <=0.11.3.
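
The conflict can be checked by hand against the constraints pip reports above. The following is a simplified sketch, not pip's real resolver: it only understands comma-separated clauses with the operators >=, <=, ==, > and <, and it ignores suffixes such as dev, so it is not a full PEP 440 implementation. The names parse_version and satisfies are made up for illustration.

```python
import operator
import re

# Comparison operators supported by this simplified checker.
OPS = {
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
    ">": operator.gt,
    "<": operator.lt,
}


def parse_version(text):
    """Turn '0.11.3' into (0, 11, 3); suffixes such as 'dev' are ignored."""
    return tuple(int(number) for number in re.findall(r"\d+", text))


def satisfies(installed, specifier):
    """Return True if the installed version meets every clause of the specifier."""
    have = parse_version(installed)
    for clause in specifier.split(","):
        match = re.match(r"\s*(>=|<=|==|>|<)\s*(\S+)", clause)
        if match is None:
            raise ValueError("unsupported clause: %r" % clause)
        op, wanted = match.groups()
        want = parse_version(wanted)
        # Pad with zeros so that (0, 10) compares like (0, 10, 0).
        width = max(len(have), len(want))
        left = have + (0,) * (width - len(have))
        right = want + (0,) * (width - len(want))
        if not OPS[op](left, right):
            return False
    return True


if __name__ == "__main__":
    # googledatastore 7.0.1's constraint versus the two httplib2 versions seen here.
    for version in ("0.9.2", "0.11.3"):
        print(version, satisfies(version, ">=0.9.1,<0.10"))
```

Both 0.9.2 and 0.11.3 pass apache-beam's own <=0.11.3 pin, so the googledatastore constraint is the only one that rules out the newer httplib2.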

      I have no idea why googledatastore is pinned to that version. Someone seems to be aware of the problem, because googledatastore==7.0.2 was released with just that constraint removed.

      The only thing missing here is to upgrade this line to use 7.0.2:

      https://github.com/apache/beam/blob/master/sdks/python/setup.py#L143
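
As a rough sketch of the change I mean (I am assuming the pin sits in a list called GCP_REQUIREMENTS; the other entries are left out here and would stay untouched):

```python
# Hypothetical excerpt of sdks/python/setup.py. The list name and the omitted
# neighbouring entries are assumptions; only the googledatastore pin comes
# from this report.
GCP_REQUIREMENTS = [
    # ... other GCP pins unchanged ...
    'googledatastore==7.0.2',  # was 7.0.1, which requires httplib2<0.10
]
```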

      Can anyone do it and release a minor version? From previous experience I know that a PR from a long-running collaborator gets merged much faster than one from a random person on the internet.


          People

            Unassigned Unassigned
            txomon Javier Domingo Cansino