Is it always the case that the driver (the program that contains the main method) must run on the master node?
For example, if I set up EC2 with one master and two workers, must my code that contains the main method be executed from the master EC2 instance?
If the answer is NO, what would be the best way to set up the system so that the driver is outside the EC2 master node (let's say the driver is run from my computer, while the master and workers are on EC2)? Do I always have to use spark-submit, or can I do it from an IDE such as Eclipse or IntelliJ IDEA?
If the answer is YES, what would be the best reference to learn more about it (since I need to provide some sort of proof)?
Thank you kindly for your answer. References would be highly appreciated!