Debian Clusters for Education and Research: The Missing Manual

MPICH: Troubleshooting the MPD

This is part five of a multi-part tutorial on installing and configuring MPICH2.

Missing .mpd.conf

If you try to start a daemon (using mpd) without a .mpd.conf file, you'll get the following error. (A file with incorrect permissions produces a shorter, related error.)

gyrfalcon:/shared$ mpd
configuration file /shared/home/kwanous/.mpd.conf not found
A file named .mpd.conf file must be present in the user's home
directory (/etc/mpd.conf if root) with read and write access
only for the user, and must contain at least a line with:
MPD_SECRETWORD=<secretword>
One way to safely create this file is to do the following:
  cd $HOME
  touch .mpd.conf
  chmod 600 .mpd.conf
and then use an editor to insert a line like
  MPD_SECRETWORD=mr45-j9z
into the file.  (Of course use some other secret word than mr45-j9z.)

The error is explicit about what needs to be done to create the .mpd.conf file. If you're running mpd as a regular user, create ~/.mpd.conf; for the root account, create /etc/mpd.conf. Choose a secret word; it is used to distinguish your processes from other people's and to keep them separate. The word can be just about anything, but standard password requirements (more than six characters long, containing at least one number and at least one letter) help make it more secure. Follow the instructions from the error message to insert it with the proper syntax and to change the permissions on the file. If you don't change the permissions, you'll see something like this:

gyrfalcon:~$ mpd
configuration file /shared/home/kwanous/.mpd.conf is accessible by others
change permissions to allow read and write access only by you
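
The steps from the error message can be collected into a small shell function. This is a minimal sketch, not part of MPICH: the function name create_mpd_conf and the use of /dev/urandom to generate the secret word are assumptions made for illustration.

```shell
# Sketch: create an mpd configuration file that only its owner can read or write.
# create_mpd_conf is a hypothetical helper name, not an MPICH command.
create_mpd_conf() {
    conf="${1:-$HOME/.mpd.conf}"   # default is the per-user file; root would pass /etc/mpd.conf

    # Never overwrite an existing secret word.
    if [ -e "$conf" ]; then
        echo "$conf already exists; leaving it alone" >&2
        return 1
    fi

    # Assumption: 12 random bytes printed as 24 hex characters make an acceptable secret word.
    secret=$(head -c 12 /dev/urandom | od -An -tx1 | tr -d ' \n')

    # Create the file empty and lock it down before the secret is written into it.
    touch "$conf" && chmod 600 "$conf" || return 1
    printf 'MPD_SECRETWORD=%s\n' "$secret" > "$conf"
}
```

Calling create_mpd_conf with no argument writes ~/.mpd.conf; as root, create_mpd_conf /etc/mpd.conf writes the system-wide file instead.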

Start mpd as a daemon in the background using

mpd --daemon

Without this option, on some systems you'll see an error like this:

kwanous@gyrfalcon:~$ mpd
gyrfalcon_53084 (mpd_sockpair 226): connect -2 Name or service not known
gyrfalcon_53084 (mpd_sockpair 233): connect error with -2 Name or service not known
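
The text "Name or service not known" is the system resolver's message for a host name it cannot look up, so it may also be worth checking that the machine's own name resolves. The following is a minimal sketch under that assumption; host_resolves and check_own_hostname are hypothetical helper names, and getent is the standard glibc lookup tool found on Debian.

```shell
# Sketch: report whether a host name resolves through the system's normal
# lookup path (/etc/hosts, DNS, and so on). host_resolves is a hypothetical name.
host_resolves() {
    getent hosts "$1" > /dev/null 2>&1
}

# Assumption: the failing lookup is for this machine's own host name.
check_own_hostname() {
    name=$(hostname)
    if host_resolves "$name"; then
        echo "$name resolves"
    else
        echo "$name does not resolve; check /etc/hosts" >&2
        return 1
    fi
}
```

If check_own_hostname fails, adding the host name to /etc/hosts is one possible fix to try.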

Missing Root's /etc/mpd.conf

Sometimes you'll see an error like this:

osprey:~# mpdtrace -l
/shared/bin/mpdroot: open failed for root's mpd conf filempdtrace (__init__ 1171): forked process failed; status=255

You'll get this message when running as root if /etc/mpd.conf, root's version of ~/.mpd.conf, doesn't exist. Use the same syntax for the root version as for a user version (it is shown in the error message under Missing .mpd.conf). Once you create it, if you don't change the permissions so that only root can read and write it, you'll see a more helpful error:

osprey:~# mpdtrace -l
configuration file /etc/mpd.conf is accessible by others
change permissions to allow read and write access only by you

Run chmod 600 /etc/mpd.conf to fix the permissions, and mpdtrace should then work.
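
The conditions behind both errors (file missing, file accessible by others) plus the required MPD_SECRETWORD line can be checked in one pass. This is a minimal sketch; check_mpd_conf is a hypothetical helper name and its return codes are arbitrary choices, not anything defined by MPICH.

```shell
# Sketch: check that an mpd configuration file exists, is closed to group and
# others, and contains a secret word. check_mpd_conf is a hypothetical helper name.
check_mpd_conf() {
    conf="${1:-/etc/mpd.conf}"

    if [ ! -f "$conf" ]; then
        echo "$conf not found" >&2
        return 1
    fi

    # Characters 5-10 of the ls -l mode string are the group and other permission bits.
    others=$(ls -ld "$conf" | cut -c5-10)
    if [ "$others" != "------" ]; then
        echo "$conf is accessible by others; run: chmod 600 $conf" >&2
        return 2
    fi

    if ! grep -q '^MPD_SECRETWORD=..*' "$conf"; then
        echo "$conf has no MPD_SECRETWORD line" >&2
        return 3
    fi
    echo "$conf looks usable"
}
```

Run as root, check_mpd_conf with no argument inspects /etc/mpd.conf; a user could run check_mpd_conf ~/.mpd.conf to inspect the per-user file.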

Python Error

As of this writing, mpd (the MPI daemon) is a Python program and requires the python2.4 binary in order to run. If you don't have Python 2.4 installed on the machine you're trying to use MPI with, you'll see an error like this:

eagle:~# /usr/bin/env: python2.4: No such file or directory
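
Before changing anything, it can help to confirm that the interpreter really is missing from the PATH. The following is a minimal sketch; interpreter_available is a hypothetical helper name.

```shell
# Sketch: report whether a command name can be found by the shell, which
# searches the PATH much as /usr/bin/env does. Note that command -v also
# matches shell builtins and functions, so this is only a quick check.
# interpreter_available is a hypothetical helper name.
interpreter_available() {
    if command -v "$1" > /dev/null 2>&1; then
        echo "$1 found at $(command -v "$1")"
    else
        echo "$1 is not on the PATH" >&2
        return 1
    fi
}
```

For the error shown here, the call would be interpreter_available python2.4.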

Fortunately, it's easy enough to fix. On every host you're going to use with MPI, run

apt-get install python2.4

If this still doesn't work, try uninstalling all versions with

apt-get remove --purge python2.4 python

running

apt-get autoremove

and then finally running the apt-get install again.

If you need to do this on all of your nodes, rather than sshing into each one and doing it individually, check out the Cluster Time-saving Tricks.
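
As a rough illustration of the idea, a loop over a list of host names can run the same command on each node over ssh. This is a minimal sketch and not necessarily the approach described in Cluster Time-saving Tricks; run_on_nodes, the REMOTE_SHELL variable, and the host list file are hypothetical, and it assumes you can ssh to each node as a user allowed to run apt-get.

```shell
# Sketch: run one command on every host listed in a file, one host name per line.
# run_on_nodes and REMOTE_SHELL are hypothetical names used for illustration.
run_on_nodes() {
    hostfile="$1"
    shift
    while read -r host; do
        # Skip blank lines and comment lines.
        case "$host" in ''|'#'*) continue ;; esac
        echo "== $host =="
        # stdin comes from /dev/null so ssh does not swallow the rest of the host list.
        ${REMOTE_SHELL:-ssh} "$host" "$@" < /dev/null
    done < "$hostfile"
}
```

With a hypothetical file nodes.txt listing the nodes, run_on_nodes nodes.txt apt-get install -y python2.4 would install the package on each one in turn.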
