Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Incomplete
Affects Version/s: 2.1.0, 2.2.1, 2.3.1
Fix Version/s: None
Description
Created a Parquet table from a CSV-backed table using CTAS. The resulting Parquet table consists of numerous small files. Tried ALTER TABLE ... CONCATENATE to merge them, but it fails with a ParseException.
spark.sql("CREATE TABLE flight.flight_data(year INT, month INT, day INT, day_of_week INT, dep_time INT, crs_dep_time INT, arr_time INT, crs_arr_time INT, unique_carrier STRING, flight_num INT, tail_num STRING, actual_elapsed_time INT, crs_elapsed_time INT, air_time INT, arr_delay INT, dep_delay INT, origin STRING, dest STRING, distance INT, taxi_in INT, taxi_out INT, cancelled INT, cancellation_code STRING, diverted INT, carrier_delay STRING, weather_delay STRING, nas_delay STRING, security_delay STRING, late_aircraft_delay STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' stored as textfile")
spark.sql("load data local INPATH 'i:/2008/2008.csv' INTO TABLE flight.flight_data")
spark.sql("create table flight.flight_data_pq stored as parquet as select * from flight.flight_data")
spark.sql("create table flight.flight_data_orc stored as orc as select * from flight.flight_data")
pyspark.sql.utils.ParseException:
Operation not allowed: alter table concatenate(line 1, pos 0)

== SQL ==
alter table flight_data.flight_data_pq concatenate
^^^
Tried this on both the ORC and the Parquet table. It fails for both.
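A possible workaround, not part of the original report: the "Operation not allowed" message indicates that the Spark SQL parser rejects ALTER TABLE ... CONCATENATE, so one option is to rewrite the table with fewer output files. The sketch below assumes an existing SparkSession named spark (as in the statements above). The helper name compact_table, the target table name flight.flight_data_pq_compact, and the file count 8 are hypothetical choices for illustration. It writes to a new table because Spark generally cannot overwrite a table it is reading from in the same job.

```python
def compact_table(spark, source_table, target_table, num_files, fmt="parquet"):
    """Copy source_table into target_table using at most num_files output files.

    Hypothetical helper: `spark` is assumed to be an active SparkSession.
    """
    if num_files < 1:
        raise ValueError("num_files must be at least 1")

    # Read the fragmented table as a DataFrame.
    df = spark.table(source_table)

    # coalesce() merges partitions without a full shuffle, so the write
    # produces at most num_files files; the data is saved as a new table.
    (df.coalesce(num_files)
       .write
       .format(fmt)
       .mode("overwrite")
       .saveAsTable(target_table))
    return target_table


# Example call (table name and file count are illustrative assumptions):
# compact_table(spark, "flight.flight_data_pq", "flight.flight_data_pq_compact", 8)
```

Passing fmt="orc" would apply the same approach to flight.flight_data_orc. If the partitions are badly skewed, repartition(num_files) could be used instead of coalesce(num_files), at the cost of a full shuffle.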