Description
Problem
When using Spark SQL with Java, the syntax required to use the following two overloads is unnatural and not obvious to developers who have not had to interoperate with Scala before:
```scala
def join(right: Dataset[_], usingColumns: Seq[String]): DataFrame
def join(right: Dataset[_], usingColumns: Seq[String], joinType: String): DataFrame
```
Examples:
Java 11
```java
Dataset<Row> dataset1 = ...;
Dataset<Row> dataset2 = ...;

// Overload with multiple usingColumns, no join type
dataset1
    .join(dataset2, JavaConverters.asScalaBuffer(List.of("column", "column2")))
    .show();

// Overload with multiple usingColumns and a join type
dataset1
    .join(
        dataset2,
        JavaConverters.asScalaBuffer(List.of("column", "column2")),
        "left")
    .show();
```
Additionally, there is no overload that takes a single usingColumn and a joinType, forcing the developer to use the Seq[String] overload regardless of language.
Examples:
Scala
```scala
val dataset1: DataFrame = ...
val dataset2: DataFrame = ...

dataset1
  .join(dataset2, Seq("column"), "left")
  .show()
```
Java 11
```java
Dataset<Row> dataset1 = ...;
Dataset<Row> dataset2 = ...;

dataset1
    .join(dataset2, JavaConverters.asScalaBuffer(List.of("column")), "left")
    .show();
```
Proposed Improvement
Add three overloads to Dataset: two that accept a Java-friendly java.util.List[String] of join columns, and one that accepts a single column name together with a join type:
```scala
def join(right: Dataset[_], usingColumns: java.util.List[String]): DataFrame
def join(right: Dataset[_], usingColumn: String, joinType: String): DataFrame
def join(right: Dataset[_], usingColumns: java.util.List[String], joinType: String): DataFrame
```
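With overloads along these lines, the Java examples above would no longer need any Scala conversion helpers. The following is an illustrative sketch only, since these overloads are a proposal and do not exist on Dataset yet:

```java
// Hypothetical usage, assuming the proposed overloads are added to Dataset.
Dataset<Row> dataset1 = ...;
Dataset<Row> dataset2 = ...;

// Multiple usingColumns with a join type: plain java.util.List,
// no JavaConverters.asScalaBuffer wrapper required
dataset1
    .join(dataset2, List.of("column", "column2"), "left")
    .show();

// Single usingColumn with a join type: no Seq at all
dataset1
    .join(dataset2, "column", "left")
    .show();
```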