Hadoop Map/Reduce / MAPREDUCE-6837

Add an equivalent to Crunch's Pair class


    Details

    • Type: Improvement
    • Status: Patch Available
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: mrv2
    • Labels:

      Description

      Crunch has this great Pair class (https://crunch.apache.org/apidocs/0.14.0/org/apache/crunch/Pair.html) that saves one from constantly implementing composite writables. It seems silly that we still don't have an equivalent in MR.

      I would like to see a new class with the following API:

      package org.apache.hadoop.io;
      
      public class CompositeWritable<P extends WritableComparable, S extends WritableComparable> implements WritableComparable<CompositeWritable> {
        public CompositeWritable(P primary, S secondary);
        public P getPrimary();
        public void setPrimary(P primary);
        public S getSecondary();
        public void setSecondary(S secondary);
      
        // Return true if o is a CompositeWritable whose primary and secondary both equal this one's
        public boolean equals(Object o);
      
        // Return the primary's hash code
        public int hashCode();
      
        // Sort first by primary and then by secondary
        public int compareTo(CompositeWritable o);
      
        public void readFields(DataInput in) throws IOException;
        public void write(DataOutput out) throws IOException;
      }
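
      A minimal sketch of how the class above might behave, not the attached patch. So that it compiles without Hadoop on the classpath, it assumes a local stand-in interface (SketchWritableComparable) with the same three methods as WritableComparable, plus a hypothetical leaf type (SketchText) playing the role of Text. All names ending in Sketch or Demo are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Assumption: stand-in for org.apache.hadoop.io.WritableComparable.
interface SketchWritableComparable<T> extends Comparable<T> {
  void write(DataOutput out) throws IOException;
  void readFields(DataInput in) throws IOException;
}

// Hypothetical leaf type, playing the role of Hadoop's Text.
final class SketchText implements SketchWritableComparable<SketchText> {
  private String value = "";
  SketchText() {}
  SketchText(String value) { this.value = value; }
  @Override public void write(DataOutput out) throws IOException { out.writeUTF(value); }
  @Override public void readFields(DataInput in) throws IOException { value = in.readUTF(); }
  @Override public int compareTo(SketchText o) { return value.compareTo(o.value); }
  @Override public boolean equals(Object o) {
    return o instanceof SketchText && value.equals(((SketchText) o).value);
  }
  @Override public int hashCode() { return value.hashCode(); }
}

final class CompositeWritableSketch<P extends SketchWritableComparable<P>,
                                    S extends SketchWritableComparable<S>>
    implements SketchWritableComparable<CompositeWritableSketch<P, S>> {
  private final P primary;
  private final S secondary;

  CompositeWritableSketch(P primary, S secondary) {
    this.primary = primary;
    this.secondary = secondary;
  }
  P getPrimary() { return primary; }
  S getSecondary() { return secondary; }

  // Serialize primary first, then secondary; readFields mirrors this order.
  @Override public void write(DataOutput out) throws IOException {
    primary.write(out);
    secondary.write(out);
  }
  @Override public void readFields(DataInput in) throws IOException {
    primary.readFields(in);
    secondary.readFields(in);
  }
  // Sort by primary, breaking ties with secondary.
  @Override public int compareTo(CompositeWritableSketch<P, S> o) {
    int c = primary.compareTo(o.primary);
    return c != 0 ? c : secondary.compareTo(o.secondary);
  }
  @Override public boolean equals(Object o) {
    if (!(o instanceof CompositeWritableSketch)) return false;
    CompositeWritableSketch<?, ?> other = (CompositeWritableSketch<?, ?>) o;
    return primary.equals(other.primary) && secondary.equals(other.secondary);
  }
  // Primary-only hash: keys sharing a primary hash identically.
  @Override public int hashCode() { return primary.hashCode(); }
}

final class CompositeWritableDemo {
  // Writes a key to a byte buffer and reads it back into a fresh instance.
  static CompositeWritableSketch<SketchText, SketchText> roundTrip(
      CompositeWritableSketch<SketchText, SketchText> key) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      key.write(new DataOutputStream(bytes));
      CompositeWritableSketch<SketchText, SketchText> copy =
          new CompositeWritableSketch<>(new SketchText(), new SketchText());
      copy.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
      return copy;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

      Hashing on the primary alone means a hash-based partitioner would send every key with the same primary to the same reducer. The sketch leaves out the setters and a no-argument constructor; Hadoop instantiates Writables reflectively, so a real implementation would also need some way to create empty P and S instances.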
      

      With such a class, implementing a secondary sort would mean just implementing a custom grouping comparator. That comparator could also be implemented as part of this JIRA:

      package org.apache.hadoop.io;
      
      public class CompositeGroupingComparator extends WritableComparator {
        ...
      }
      

      Or some such.
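
      To illustrate why a grouping comparator is the only missing piece, here is a rough, Hadoop-free simulation of the reduce side. It assumes a hypothetical key (StationTimeKey) and uses plain java.util.Comparator in place of WritableComparator; in a real job the comparator would be registered with Job.setGroupingComparatorClass. Keys are sorted by the full (primary, secondary) order, and a new group starts whenever the primary-only comparator sees a difference between adjacent keys.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical composite key: primary = station id, secondary = timestamp.
final class StationTimeKey implements Comparable<StationTimeKey> {
  final String station;
  final long timestamp;

  StationTimeKey(String station, long timestamp) {
    this.station = station;
    this.timestamp = timestamp;
  }

  // Full sort order: primary first, then secondary.
  @Override public int compareTo(StationTimeKey o) {
    int c = station.compareTo(o.station);
    return c != 0 ? c : Long.compare(timestamp, o.timestamp);
  }
}

// Grouping comparator: looks at the primary only, so keys that differ
// only in their secondary land in the same group.
final class PrimaryOnlyGroupingComparator implements Comparator<StationTimeKey> {
  @Override public int compare(StationTimeKey a, StationTimeKey b) {
    return a.station.compareTo(b.station);
  }
}

final class SecondarySortDemo {
  // Sorts with the full ordering, then splits the sorted run into groups
  // wherever the grouping comparator reports a change of primary.
  static List<List<Long>> sortAndGroup(List<StationTimeKey> keys) {
    List<StationTimeKey> sorted = new ArrayList<>(keys);
    sorted.sort(Comparator.naturalOrder());

    Comparator<StationTimeKey> grouping = new PrimaryOnlyGroupingComparator();
    List<List<Long>> groups = new ArrayList<>();
    StationTimeKey previous = null;
    for (StationTimeKey key : sorted) {
      if (previous == null || grouping.compare(previous, key) != 0) {
        groups.add(new ArrayList<>());
      }
      groups.get(groups.size() - 1).add(key.timestamp);
      previous = key;
    }
    return groups;
  }
}
```

      Within each group the secondaries arrive already ordered, which is the secondary-sort effect the description is after.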

      Crunch also provides Tuple3, Tuple4, and TupleN classes, but I don't think we need to add equivalents. If someone really wants that capability, they can nest composite keys.
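
      As a small illustration of the nesting idea, the sketch below uses a hypothetical generic pair (PairSketch) built on plain Comparable rather than on Hadoop types. Putting a pair in the second slot gives a three-part key that sorts on each part in turn.

```java
// Hypothetical two-element key; a stand-in for the proposed composite class.
final class PairSketch<A extends Comparable<A>, B extends Comparable<B>>
    implements Comparable<PairSketch<A, B>> {
  final A first;
  final B second;

  PairSketch(A first, B second) {
    this.first = first;
    this.second = second;
  }

  // Sort by first, breaking ties with second.
  @Override public int compareTo(PairSketch<A, B> o) {
    int c = first.compareTo(o.first);
    return c != 0 ? c : second.compareTo(o.second);
  }
}

final class NestedKeyDemo {
  // Builds a three-part key by nesting one pair inside another.
  static PairSketch<String, PairSketch<Integer, Integer>> triple(String a, Integer b, Integer c) {
    return new PairSketch<>(a, new PairSketch<>(b, c));
  }
}
```

      The cost of nesting is readability: the third element is reached as key.second.second rather than through a named accessor.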

      Don't forget to add unit tests!

        Attachments

        1. MAPREDUCE-6837.002.patch
          9 kB
          Gézapeti
        2. MAPREDUCE-6837.001.patch
          9 kB
          Gézapeti


            People

            • Assignee: Gézapeti (gezapeti)
            • Reporter: Daniel Templeton (templedf)
            • Votes: 0
            • Watchers: 3
