[SPARK-24623] Hadoop - Spark Cluster - Python XGBoost - Not working in distributed mode - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Major
Resolution: Incomplete
Affects Version/s: 2.1.1
Fix Version/s: None
Component/s: Deploy
Labels:
- bulk-closed
Environment:

Hadoop - Hortonworks Cluster

Total Nodes - 18

Worker Nodes - 13

Description

We recently installed Python on the Hadoop cluster along with a number of data science modules, including xgboost, scipy, scikit-learn, and pandas.
Using PySpark, the data scientists are able to test their scoring models in distributed mode on the Hadoop cluster. With Python xgboost, however, the PySpark job is not distributed and runs on only one instance.
We are trying to run Python xgboost in distributed mode via PySpark.
It would be a great help if you could direct me on how to achieve this.

Thanks,
Abhishek
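
For context on the behaviour described above: the plain Python xgboost package runs in a single process, so calling it directly from driver code keeps the work on one machine regardless of cluster size. One way scoring (not training) is sometimes spread across executors is to apply an already-trained model inside `RDD.mapPartitions`. The following is a minimal sketch under stated assumptions, not a confirmed fix for this issue: `make_partition_scorer` and `load_model` are hypothetical names, and the model is assumed to expose a `predict(batch)` method that accepts a list of feature rows and returns one score per row.

```python
from typing import Callable, Iterable, Iterator, List, Sequence


def make_partition_scorer(load_model: Callable[[], object], batch_size: int = 1000):
    """Build a function that can be passed to RDD.mapPartitions.

    load_model is a hypothetical zero-argument callable that returns a trained
    model with a predict(batch) method (assumption, see the text above).
    """

    def score_partition(rows: Iterable[Sequence[float]]) -> Iterator[float]:
        # Load the model once per partition, on whichever worker runs it.
        model = load_model()
        batch: List[List[float]] = []
        for row in rows:
            batch.append(list(row))
            if len(batch) >= batch_size:
                # Score a full batch and emit one prediction per input row.
                yield from model.predict(batch)
                batch = []
        if batch:
            # Score whatever is left over at the end of the partition.
            yield from model.predict(batch)

    return score_partition


# Possible wiring on a cluster (requires pyspark; shown as a comment only):
#   scores = feature_rdd.mapPartitions(make_partition_scorer(load_model))
```

Because each partition is scored independently, the number of partitions in the input RDD bounds how many executors can take part. Distributed training is a separate problem that this sketch does not address.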

People

Assignee: Unassigned

Reporter: Abhishek Reddy Chamakura

Votes: 0

Watchers: 1

Dates

Created: 21/Jun/18 16:26

Updated: 08/Oct/19 05:44

Resolved: 08/Oct/19 05:44