Uploaded image for project: 'Pig'
  1. Pig
  2. PIG-2721

Wrong output generated while loading bags as input

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Closed
    • Major
    • Resolution: Fixed
    • 0.9.0, 0.9.2, 0.10.0, 0.11
    • 0.9.3, 0.11, 0.10.1
    • None
    • None
    • Reviewed

    Description

      A = LOAD '/user/pvivek/sample' as (id:chararray,mybag:bag{tuple(bttype:chararray,cat:long)});
      B = foreach A generate id,FLATTEN(mybag) AS (bttype, cat);
      C = order B by id;
      dump C;
      

      The above code generates wrong results when executed with Pig 0.10 and Pig 0.9
      The below is the sample input;

      ...LKGaHqg--	{(aa,806743)}
      ..0MI1Y37w--	{(aa,498970)}
      ..0bnlpJrw--	{(aa,806740)}
      ..0p0IIhbA--	{(aa,498971),(se,498995)}
      ..1VkGqvXA--	{(aa,805219)}
      

      I think the Pig optimizers are causing this issue.From the logs I can see that the $1 is pruned for the relation A.

      [main] INFO org.apache.pig.newplan.logical.rules.ColumnPruneVisitor - Columns pruned for A: $1

      One workaround for this is to disable -t ColumnMapKeyPrune.

      Attachments

        1. pig-2721-trunk-withtest_v1.patch
          5 kB
          Koji Noguchi
        2. pig-2721-trunk-notestyet.patch
          0.8 kB
          Koji Noguchi

        Activity

          People

            knoguchi Koji Noguchi
            vivekp Vivek Padmanabhan
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: