Details
Description
Oozie is being adopted as the default workflow/scheduling engine for big data.
Small organizations currently prefer on-demand clusters such as Amazon's EMR over a full-fledged Hadoop setup. However, there is no support yet for a powerful workflow engine like Oozie that seamlessly schedules and executes user jobs on EMR.
Oozie could provide a new ActionExecutor class, such as EMRActionExecutor, that takes all the credentials required for EMR.
Oozie could be installed on an Amazon EC2 instance, which could then talk to any dynamically created EMR cluster.
Although Oozie supports filesystems other than HDFS, some tweaks may be needed to support filesystems such as S3.
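
To make the EMRActionExecutor idea more concrete, here is a minimal, self-contained sketch of the start/check lifecycle such an executor might follow. It does not use Oozie's real ActionExecutor API or the AWS SDK: EmrCredentials, EmrClient, FakeEmrClient, the method names, and the sample cluster ID and S3 path are all hypothetical stand-ins chosen for illustration. The fake client keeps step state in memory so the example runs without AWS access.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: none of these types exist in Oozie or the AWS SDK.
class EmrActionSketch {

    /** Credentials the action would need; field names are assumptions. */
    static final class EmrCredentials {
        final String accessKey;
        final String secretKey;

        EmrCredentials(String accessKey, String secretKey) {
            if (accessKey == null || accessKey.isEmpty()
                    || secretKey == null || secretKey.isEmpty()) {
                throw new IllegalArgumentException("EMR access key and secret key are required");
            }
            this.accessKey = accessKey;
            this.secretKey = secretKey;
        }
    }

    /** Stand-in for a real EMR client; a real executor would call the AWS API here. */
    interface EmrClient {
        String submitStep(EmrCredentials creds, String clusterId, String jarOnS3);
        String stepState(String stepId);
    }

    /** In-memory fake so the sketch runs without AWS access. */
    static final class FakeEmrClient implements EmrClient {
        private final Map<String, String> states = new HashMap<>();
        private int counter = 0;

        @Override
        public String submitStep(EmrCredentials creds, String clusterId, String jarOnS3) {
            String stepId = "step-" + (++counter);
            states.put(stepId, "RUNNING");
            return stepId;
        }

        @Override
        public String stepState(String stepId) {
            return states.getOrDefault(stepId, "UNKNOWN");
        }

        /** Test hook: mark a submitted step as finished. */
        void complete(String stepId) {
            states.put(stepId, "COMPLETED");
        }
    }

    /** Submits a job to an EMR cluster and reports whether it has finished. */
    static final class EmrActionExecutor {
        private final EmrClient client;
        private final EmrCredentials creds;

        EmrActionExecutor(EmrClient client, EmrCredentials creds) {
            this.client = client;
            this.creds = creds;
        }

        /** Starts the job and returns an external ID the workflow engine can poll. */
        String start(String clusterId, String jarOnS3) {
            return client.submitStep(creds, clusterId, jarOnS3);
        }

        /** Polls the external job; true once the step has completed. */
        boolean isCompleted(String stepId) {
            return "COMPLETED".equals(client.stepState(stepId));
        }
    }

    public static void main(String[] args) {
        FakeEmrClient client = new FakeEmrClient();
        EmrActionExecutor executor =
                new EmrActionExecutor(client, new EmrCredentials("example-key", "example-secret"));

        // Cluster ID and S3 path are made-up placeholders.
        String stepId = executor.start("example-cluster", "s3://example-bucket/job.jar");
        System.out.println("submitted " + stepId + ", completed=" + executor.isCompleted(stepId));

        client.complete(stepId);
        System.out.println("completed=" + executor.isCompleted(stepId));
    }
}
```

Keeping the EMR calls behind a small interface is only one possible design. It would let an Oozie server on EC2 point at any cluster by ID, and let the executor be tested without launching a real cluster.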