Uploaded image for project: 'Beam'
  1. Beam
  2. BEAM-12388

Improve caching experience on InteractiveRunner with dataframes

Details

    • Improvement
    • Status: Open
    • P3
    • Resolution: Unresolved
    • None
    • None
    • runner-py-interactive
    • None

    Description

      Reusing the default label for to_pcollection when using the interactive runner results in caching errors when used with multiple pipelines:

       

       

      {{Traceback (most recent call last):
      File "/home/srohde/Workdir/beam/sdks/python/apache_beam/runners/interactive/interactive_runner_test.py", line 389, in test_dataframes_with_multi_index_get_result
      pd.testing.assert_series_equal(df_expected, ib.collect(deferred_df, n=10))
      File "/home/srohde/Workdir/beam/sdks/python/apache_beam/runners/interactive/utils.py", line 247, in run_within_progress_indicator
      return func(*args, **kwargs)
      File "/home/srohde/Workdir/beam/sdks/python/apache_beam/runners/interactive/interactive_beam.py", line 579, in collect
      recording = recording_manager.record([pcoll], max_n=n, max_duration=duration)
      File "/home/srohde/Workdir/beam/sdks/python/apache_beam/runners/interactive/recording_manager.py", line 433, in record
      self._watch(pcolls)
      File "/home/srohde/Workdir/beam/sdks/python/apache_beam/runners/interactive/recording_manager.py", line 306, in _watch
      for pcoll in to_pcollection(*watched_dataframes, always_return_tuple=True):
      File "/home/srohde/Workdir/beam/sdks/python/apache_beam/dataframe/convert.py", line 196, in to_pcollection
      new_results =

      {p: extract_input(p) File "/home/srohde/Workdir/beam/sdks/python/apache_beam/transforms/ptransform.py", line 1086, in __ror__ return self.transform.__ror__(pvalueish, self.label) File "/home/srohde/Workdir/beam/sdks/python/apache_beam/transforms/ptransform.py", line 587, in __ror__ raise ValueError( ValueError: Mixing value from different pipelines not allowed.}

      }

      Attachments

        Activity

          People

            Unassigned Unassigned
            rohdesam Sam Rohde
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

              Created:
              Updated:

              Time Tracking

                Estimated:
                Original Estimate - Not Specified
                Not Specified
                Remaining:
                Remaining Estimate - 0h
                0h
                Logged:
                Time Spent - 4h 50m
                4h 50m