How to deploy distributed TensorFlow Slim example on multiple nodes (CPUs only)
I have implemented and deployed a CNN training example on my cluster (multiple hosts/nodes) following the distributed TensorFlow tutorial. Now I want to run a TensorFlow Slim example on a cluster containing a few hosts/nodes (CPUs only).

In the example code of the distributed TensorFlow tutorial, I can use --ps_hosts, --worker_hosts, and --job_name to specify a cluster and a particular job type (ps or worker). However, in train_image_classifier.py I did not find arguments via which I can specify the cluster and the job name. Here is the tutorial for deploying TF-Slim: TF Slim Deploy.

I was wondering whether the current TF-Slim library supports deploying a training job on multiple nodes. If yes, how do I launch a distributed TF-Slim job on a cluster? It would be great if you could provide some example code/scripts, just like the code example in the distributed TensorFlow tutorial.
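For reference, this is roughly the launcher pattern from the distributed TensorFlow tutorial that I am referring to: a minimal sketch of how --ps_hosts, --worker_hosts, and --job_name are turned into a ClusterSpec and a Server. The model-building step is elided, and the flag names follow the tutorial's trainer script, not train_image_classifier.py:

```python
import argparse
import sys
import tensorflow as tf

FLAGS = None

def main(_):
    ps_hosts = FLAGS.ps_hosts.split(",")
    worker_hosts = FLAGS.worker_hosts.split(",")

    # Describe the cluster and start a server for the local task.
    cluster = tf.train.ClusterSpec({"ps": ps_hosts, "worker": worker_hosts})
    server = tf.train.Server(cluster,
                             job_name=FLAGS.job_name,
                             task_index=FLAGS.task_index)

    if FLAGS.job_name == "ps":
        # Parameter servers host variables and block forever.
        server.join()
    elif FLAGS.job_name == "worker":
        # Pin variables to ps tasks and ops to this worker.
        with tf.device(tf.train.replica_device_setter(
                worker_device="/job:worker/task:%d" % FLAGS.task_index,
                cluster=cluster)):
            # ... build the model and train_op here ...
            pass

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--ps_hosts", type=str, default="",
                        help="Comma-separated list of hostname:port pairs")
    parser.add_argument("--worker_hosts", type=str, default="",
                        help="Comma-separated list of hostname:port pairs")
    parser.add_argument("--job_name", type=str, default="",
                        help="One of 'ps', 'worker'")
    parser.add_argument("--task_index", type=int, default=0,
                        help="Index of the task within its job")
    FLAGS, unparsed = parser.parse_known_args()
    tf.app.run(main=main, argv=[sys.argv[0]] + unparsed)
```

With this pattern, the same script is started once per node: with --job_name=ps on the parameter-server host, and with --job_name=worker plus a distinct --task_index on each worker host. What I am looking for is the equivalent set of flags (or a wrapper) for train_image_classifier.py.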