Pig / PIG-3907

Built-in function COR does not work with any numeric type other than double.


Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.11.1
    • Fix Version/s: None
    • Component/s: build, piggybank
    • Labels: None

    Description

      Apache Pig provides a built-in function, COR (correlation), which calculates the correlation between variables.
      COR does not work if any input variable is of type int or long. The variables have to be cast explicitly to double in the Pig script, which is not a good option on the UI end.

      I unit tested the correlation function with int values, and it fails to iterate over the bag. The same happens when a mix of int, long, and double variables is passed to COR. My unit test with doubles only gives the correct output.
      I also ran the script on a Hadoop cluster; it fails if any variable is of a type other than double.
      It shows the following error on the Hadoop cluster:
      ERROR org.apache.pig.tools.grunt.GruntParser - ERROR 2999: Unexpected internal error. null
      or sometimes ERROR 1066: Unable to open iterator for alias aliasName. Backend error : null

      The Java code of the COR function casts everything to double, which is correct. But in the computeAll(-,-) function, the (Double) cast applied to the iterator values to obtain x and y causes the problem, because a boxed Integer or Long cannot be cast to Double.

      Exact code:
      double x =(Double)iterator_x.next().get(0); // error when int or long
      double y =(Double)iterator_y.next().get(0); // error when int or long
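
      The failure mode of the two casts above can be reproduced outside Pig. The following is a minimal sketch in plain Java, not the actual COR source: the class CastDemo and its helper methods are hypothetical names used only for illustration. It shows that a direct (Double) cast throws ClassCastException for a boxed Integer or Long, while going through Number.doubleValue() accepts all three types.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class; it is not part of Pig.
public class CastDemo {

    // Mirrors the pattern quoted above: a direct (Double) cast on a boxed value.
    static double viaDoubleCast(Object value) {
        return (Double) value; // ClassCastException if value is an Integer or Long
    }

    // Assumed alternative: widen through Number, which Integer, Long, Float
    // and Double all extend.
    static double viaNumber(Object value) {
        return ((Number) value).doubleValue();
    }

    // Returns true if the direct cast throws ClassCastException for this value.
    static boolean castFails(Object value) {
        try {
            viaDoubleCast(value);
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<Object> samples = Arrays.<Object>asList(1.5, 2, 3L);
        for (Object sample : samples) {
            System.out.println(sample.getClass().getSimpleName()
                    + ": direct cast fails = " + castFails(sample)
                    + ", via Number = " + viaNumber(sample));
        }
    }
}
```

      Whether this is exactly what surfaces as ERROR 2999 or ERROR 1066 on the cluster is an assumption; the sketch only demonstrates the Java casting behaviour.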

      Possible solution: override the method getArgToFuncMapping() and define separate classes such as IntCOR, LongCOR, and FloatCOR, as is done for some other UDFs like VAR.
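
      As a point of comparison with the per-type classes proposed above, a different and simpler approach would be to read every value through Number.doubleValue(). The sketch below is plain Java and is not Pig's COR implementation: the class NumberCorrelation, its method, and the choice of a single-pass Pearson formula are assumptions made for illustration. It only shows that one code path can handle int, long, float, and double inputs.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; it is not part of Pig or piggybank.
public class NumberCorrelation {

    // Pearson correlation of two equally sized lists of any numeric type.
    static double correlation(List<? extends Number> xs, List<? extends Number> ys) {
        if (xs.isEmpty() || xs.size() != ys.size()) {
            throw new IllegalArgumentException("inputs must be non-empty and of equal size");
        }
        int n = xs.size();
        double sumX = 0, sumY = 0, sumXY = 0, sumXX = 0, sumYY = 0;
        for (int i = 0; i < n; i++) {
            // doubleValue() works for Integer, Long, Float and Double alike.
            double x = xs.get(i).doubleValue();
            double y = ys.get(i).doubleValue();
            sumX += x;
            sumY += y;
            sumXY += x * y;
            sumXX += x * x;
            sumYY += y * y;
        }
        double covariance = n * sumXY - sumX * sumY;
        double varianceX = n * sumXX - sumX * sumX;
        double varianceY = n * sumYY - sumY * sumY;
        // Result is NaN when either input has zero variance.
        return covariance / Math.sqrt(varianceX * varianceY);
    }

    public static void main(String[] args) {
        List<Integer> ints = Arrays.asList(1, 2, 3);
        List<Long> longs = Arrays.asList(2L, 4L, 6L);
        System.out.println(correlation(ints, longs));
    }
}
```

      This avoids one class per input type, but it does not use Pig's getArgToFuncMapping() dispatch, so it is an alternative to the proposal rather than a sketch of it.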

      Please fix the issue in piggybank as well as in the built-in library of Pig.
      I am using Apache Pig 0.11.0.

          People

            Assignee: Unassigned
            Reporter: Rishi Pandey (rishi.pandey)
