Uploaded image for project: 'Pig'
  1. Pig
  2. PIG-3401

UNION on schema throws ExecException: ERROR 2055

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Resolved
    • Major
    • Resolution: Not A Problem
    • 0.11
    • None
    • grunt
    • None
    • local

    Description

      Hi, I get strange exception when trying to union two relations by schema.
      It works when one of relations doesn't have any records.
      It breaks when both relations are not empty.
      Here is a part of the code:

      lastEndPoints24h = LOAD '$lastEndPoints24h' USING org.apache.pig.piggybank.storage.avro.AvroStorage();
      describe lastEndPoints24h;
      dump lastEndPoints24h;
      lastEndPoints24hProj = FOREACH lastEndPoints24h GENERATE msisdn, ts,
                                                                     center_lon, center_lat,
                                                                     lac, cid, lon, lat, cell_type, is_active, azimuth, hpbw, max_dist,
                                                                     tile_id, zone_col, zone_row,
                                                                     is_end_point, end_point_type;
      describe lastEndPoints24hProj;
      dump lastEndPoints24hProj;
      
      unionOfPivotsAndLastEndPoints = UNION ONSCHEMA validPivotsProj, lastEndPoints24hProj;
      describe unionOfPivotsAndLastEndPoints;
      --dump unionOfPivotsAndLastEndPoints;
      
      groupedValidPivots = GROUP unionOfPivotsAndLastEndPoints BY msisdn;
      dump groupedValidPivots;
      

      Something bad happens when I try to access union result in relation unionOfPivotsAndLastEndPoints.

      I can say for sure that relation lastEndPoints24h is correctly opened.
      Here is a proof:

      2013-07-29 03:34:18,833 [main] INFO  org.apache.pig.tools.pigstats.SimplePigStats - Script Statistics: 
      
      HadoopVersion	PigVersion	UserId	StartedAt	FinishedAt	Features
      2.0.0-cdh4.3.0	0.11.0-cdh4.3.0	ssa	2013-07-29 03:34:13	2013-07-29 03:34:18	UNKNOWN
      
      Success!
      
      Job Stats (time in seconds):
      JobId	Alias	Feature	Outputs
      job_local634744752_0006	lastEndPoints24h	MAP_ONLY	file:/tmp/temp-1898051886/tmp-1962855781,
      
      Input(s):
      Successfully read records from: "/home/ssa/devel/lololabs/analyt/some_analyt_case/src/test/resources/pig/route_pivot_preparator/test_2013_07_23/lastEndPoints24h.avro"
      
      Output(s):
      Successfully stored records in: "file:/tmp/temp-1898051886/tmp-1962855781"
      
      Job DAG:
      job_local634744752_0006
      

      And here is schema and dump for it's projection lastEndPoints24hProj:

      (79263332100,1374521131,37.553441893272755,55.880436657140294,7712,24316,37.5473,55.8792,OUTDOOR,true,75,60,1102,49646,469,410,true,JITTER_START)
      
      lastEndPoints24hProj: {msisdn: long,ts: long,center_lon: double,center_lat: double,lac: int,cid: int,lon: double,lat: double,cell_type: chararray,is_active: boolean,azimuth: int,hpbw: int,max_dist: int,tile_id: int,zone_col: int,zone_row: int,is_end_point: boolean,end_point_type: chararray}
      

      When this file is empty (one of test cases), script works correctly.
      When this file is not empty I do get

      2013-07-29 03:34:47,898 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1066: Unable to open iterator for alias groupedValidPivots
      Details at logfile: /home/ssa/devel/lololabs/analyt/some_analyt_case/src/main/resources/pig/pig_1375054429131.log
      

      An exception from log file

      Pig Stack Trace
      ---------------
      ERROR 1066: Unable to open iterator for alias groupedValidPivots
      
      org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias groupedValidPivots
      	at org.apache.pig.PigServer.openIterator(PigServer.java:838)
      	at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:696)
      	at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:320)
      	at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:194)
      	at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:170)
      	at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
      	at org.apache.pig.Main.run(Main.java:604)
      	at org.apache.pig.Main.main(Main.java:157)
      	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
      	at java.lang.reflect.Method.invoke(Method.java:597)
      	at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
      Caused by: java.io.IOException: Job terminated with anomalous status FAILED
      	at org.apache.pig.PigServer.openIterator(PigServer.java:830)
      	... 12 more
      ================================================================================
      
      

      Any "touch" of union gives an error with test: "unable to open iterator for alias ..."

      Schemas are fully defined, field names do match. What's the problem?

      Attachments

        Activity

          People

            Unassigned Unassigned
            serega_sheypak Sergey
            Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: