Details
Type: New Feature
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: 1.6.3, 2.3.0, 2.4.5
None
None
Description
We need the application ID from the resource manager to be mapped to the specific Spark SQL query that was executed under that application ID, so that back-tracing is possible.
For example, if I run a query using spark-shell:
spark.sql("select dt.d_year,item.i_brand_id brand_id,item.i_brand brand,sum(ss_ext_sales_price) sum_agg from date_dim dt,store_sales,item where dt.d_date_sk = store_sales.ss_sold_date_sk and store_sales.ss_item_sk = item.i_item_sk and item.i_manufact_id = 436 and dt.d_moy=12 group by dt.d_year,item.i_brand,item.i_brand_id order by dt.d_year,sum_agg desc,brand_id limit 100").show();
When I look at the event logs or the history server, I don't see the query text anywhere; only the query plan is there. This makes it difficult to trace back which query was actually submitted, for instance when it has to be mapped to a specific application ID on YARN.
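Until something like this exists, one possible workaround is to tag the work from the client side. The sketch below is meant to be pasted into spark-shell, where `spark` is already defined. It prints the application ID next to the query text and sets the query as the job description through `SparkContext.setJobDescription`. `runTagged` is a hypothetical helper name, not part of Spark, and the short query is only a placeholder. The assumption here is that the description then shows up with the jobs in the UI and in the job properties written to the event log; how it is displayed may differ between Spark versions.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

// Hypothetical helper (not a Spark API): records which query text
// belongs to which application ID before running the query.
def runTagged(spark: SparkSession, query: String): DataFrame = {
  val sc = spark.sparkContext
  // applicationId is the ID assigned by the cluster manager
  // (on YARN it looks like application_<timestamp>_<sequence>).
  println(s"applicationId=${sc.applicationId} query=$query")
  // The description applies to jobs started later from this thread,
  // so it is still in effect when an action such as show() runs.
  sc.setJobDescription(query)
  spark.sql(query)
}

// Placeholder query against one of the tables used above.
val df = runTagged(spark, "select count(*) from store_sales")
df.show()
```

The description stays set on the calling thread until it is replaced, so each query should go through the helper; otherwise later jobs would carry the text of an earlier query.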