[BEAM-12388] Improve caching experience on InteractiveRunner with dataframes - ASF JIRA

XML

Word

Printable

JSON

Details

Type: Improvement
Status: Open
Priority: P3
Resolution: Unresolved
Affects Version/s: None
Fix Version/s: None
Component/s: runner-py-interactive
Labels:
None

Description

Reusing the default label for to_pcollection when using the interactive runner results in caching errors when used with multiple pipelines:

{{Traceback (most recent call last):
File "/home/srohde/Workdir/beam/sdks/python/apache_beam/runners/interactive/interactive_runner_test.py", line 389, in test_dataframes_with_multi_index_get_result
pd.testing.assert_series_equal(df_expected, ib.collect(deferred_df, n=10))
File "/home/srohde/Workdir/beam/sdks/python/apache_beam/runners/interactive/utils.py", line 247, in run_within_progress_indicator
return func(*args, **kwargs)
File "/home/srohde/Workdir/beam/sdks/python/apache_beam/runners/interactive/interactive_beam.py", line 579, in collect
recording = recording_manager.record([pcoll], max_n=n, max_duration=duration)
File "/home/srohde/Workdir/beam/sdks/python/apache_beam/runners/interactive/recording_manager.py", line 433, in record
self._watch(pcolls)
File "/home/srohde/Workdir/beam/sdks/python/apache_beam/runners/interactive/recording_manager.py", line 306, in _watch
for pcoll in to_pcollection(*watched_dataframes, always_return_tuple=True):
File "/home/srohde/Workdir/beam/sdks/python/apache_beam/dataframe/convert.py", line 196, in to_pcollection
new_results =

{p: extract_input(p) File "/home/srohde/Workdir/beam/sdks/python/apache_beam/transforms/ptransform.py", line 1086, in __ror__ return self.transform.__ror__(pvalueish, self.label) File "/home/srohde/Workdir/beam/sdks/python/apache_beam/transforms/ptransform.py", line 587, in __ror__ raise ValueError( ValueError: Mixing value from different pipelines not allowed.}

}

Attachments

Issue Links

links to

GitHub Pull Request #15146

GitHub Pull Request #15180

Activity

People

Assignee:: Unassigned

Reporter:: Sam Rohde

Votes:: 0 Vote for this issue

Watchers:: 3 Start watching this issue

Dates

Created:: 21/May/21 18:19

Updated:: 04/Jun/22 20:11

Time Tracking

Estimated:

Not Specified

Remaining:

Logged:

4h 50m