What happens when you make a request to a remote Spark cluster?
Nov 21, 2019
- The Spark app requests resources from a “master”. By default, the Spark master is local.
- When you run dse pyspark or dse spark-submit, DSE automatically asks the local node for the DSE Resource Manager, whose Spark Master UI runs on port 7080. The default master, however, is local: everything runs in the same JVM and no distributed resources are requested. As a result, local runs do not show up in the DSE Resource Manager (Spark UI).
- The Resource Manager spins up executors, which communicate back to the submitting application (in client mode only).
- To request resources from a remote DSE cluster instead, pass the master and connection host explicitly:
dse pyspark --master dse://10.101.36.16 --conf spark.cassandra.connection.host=10.101.36.16
That should establish the connection, and the application should appear in the Resource Manager / Spark UI.
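For scripted submissions, the same flags can be assembled programmatically before shelling out. A minimal sketch in Python — assuming the dse launcher is on your PATH and that 10.101.36.16 (the example node from the command above) is a reachable cluster node:

```python
# Sketch: assemble the dse pyspark invocation for a remote DSE Spark master.
# Assumptions: the "dse" launcher is on PATH; 10.101.36.16 is your cluster node.
node_host = "10.101.36.16"

cmd = [
    "dse", "pyspark",
    # dse:// tells Spark to request executors from the DSE Resource Manager
    "--master", f"dse://{node_host}",
    # tells the Cassandra connector which node to contact
    "--conf", f"spark.cassandra.connection.host={node_host}",
]
print(" ".join(cmd))
```

From here you could launch it with subprocess.run(cmd); swapping "pyspark" for "spark-submit" plus your application file works the same way.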