I am using Apache Spark to run machine learning algorithms and other big data tasks. Previously, I was running Spark in standalone cluster mode, with the Spark master and a worker on the same machine. Now, I have added several worker machines, and because of a tight firewall I have to change the workers' random ports. Can anyone explain how to change the random Spark ports and tell me exactly which configuration file needs to be edited? I read the Spark documentation, and it says `spark-defaults.conf` should be configured, but I don't know how to configure this file specifically to change the random ports.
Update for Spark 2.x
Some libraries have been rewritten from scratch, and many legacy `*.port` properties are now obsolete (cf. SPARK-10997 / SPARK-20605 / SPARK-12588 / SPARK-17678 / etc.).

For Spark 2.1, for instance, the port ranges on which the driver will listen for executor traffic are:

- `spark.driver.port` to `spark.driver.port` + `spark.port.maxRetries`
- `spark.driver.blockManager.port` to `spark.driver.blockManager.port` + `spark.port.maxRetries`

And the port range on which the executors will listen for driver traffic and/or traffic from other executors is:

- `spark.blockManager.port` to `spark.blockManager.port` + `spark.port.maxRetries`
The `spark.port.maxRetries` property allows several Spark jobs to run in parallel: if the base port is already in use, the new job tries the next one, and so on, until the whole range is exhausted.
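This retry behavior can be illustrated with a minimal sketch (plain Python sockets, not Spark itself; `bind_with_retries`, the base port, and the retry count are illustrative choices):

```python
import socket

def bind_with_retries(base_port, max_retries):
    """Mimic Spark's port-retry behavior: try base_port, then
    base_port + 1, ... up to base_port + max_retries, returning
    the first socket that binds successfully."""
    for offset in range(max_retries + 1):
        port = base_port + offset
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            sock.bind(("127.0.0.1", port))
            return sock, port
        except OSError:
            sock.close()  # port taken; fall through to the next one
    raise OSError(f"no free port in [{base_port}, {base_port + max_retries}]")

# The first caller gets a port at the bottom of the range; a second
# caller finds that port occupied and is pushed to a higher one.
s1, p1 = bind_with_retries(45000, 16)
s2, p2 = bind_with_retries(45000, 16)
print(p1, p2)
s1.close(); s2.close()
```

This is exactly why a firewall must open the whole range `base` to `base + spark.port.maxRetries`, not just the base port.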
Sources:

- https://spark.apache.org/docs/2.1.1/configuration.html#networking
- https://spark.apache.org/docs/2.1.1/security.html, under "Configuring ports"
Check here: https://spark.apache.org/docs/latest/configuration.html#networking
In the "Networking" section, you can see that some of the ports are random by default. You can set them to values of your choice, like below:
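The original example seems to have been cut off here; as a sketch, the otherwise-random ports can be pinned in `conf/spark-defaults.conf` (the port numbers below are arbitrary examples, so pick values your firewall permits):

```
# conf/spark-defaults.conf -- pin otherwise-random ports
spark.driver.port        40000
spark.blockManager.port  40010
spark.port.maxRetries    16
```

With `spark.port.maxRetries` set to 16, the firewall then only needs to open ports 40000-40016 and 40010-40026 for these two properties.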