[Moabusers] Jobs running outside the scheduler

Gelonia L Dent gdent at amnh.org
Fri Nov 21 11:37:30 MST 2008


I continue to see jobs that appear to be running outside of the
scheduler. I would appreciate some feedback on how this might be
happening.

Here is some output from the current state of the scheduler:
*********************
showq

active jobs------------------------
JOBID              USERNAME      STATE PROCS   REMAINING            STARTTIME

5836                  taran    Running    32  3:01:55:23  Thu Nov 20 11:23:30
5748                  jware    Running     8  3:09:35:09  Wed Nov 12 11:00:16
5840                  jware    Running    12 11:12:29:05  Thu Nov 20 13:57:12

3 active jobs            52 of 128 processors in use by local jobs (40.62%)
                          13 of 32 nodes active      (40.62%)

- - - - - - --
Now looking at the processes running on the system (a snapshot of the
output follows), notice the poy_4.1 processes, which are not associated
with any of the scheduled jobs.

 R   500 12246 12154 99 -40   - - 85629 ghost_ ?        1-02:00:13 poy_4.1
1 S   500 12247 12246  0 -40   - -  7308 ghost_ ?        00:00:00 poy_4.1
1 S   530 12349 12219  0 -40   - -  2721 ghost_ ?        00:00:01 mpiexec
1 R   530 12350 11379 99 -40   - - 68069 ghost_ ?        9-01:43:23 mb
1 S     0 12520     1  0 -40   - -  1720 ghost_ ?        00:00:00 portmap
1 S     0 12715     1  0 -40   - -  1720 ghost_ ?        00:00:00 portmap
1 S   530 12794 16051  0 -40   - -  1394 ghost_ ?        00:00:00 bash
1 S   530 12795 12794  0 -40   - -   955 ghost_ ?        00:00:00 pbs_demux
1 S   530 12796 12794  0 -40   - -  1342 ghost_ ?        00:00:00 5840.scyld..SC
1 S   530 12798 12796  0 -40   - -  2754 ghost_ ?        00:00:00 mpiexec
1 S     0 12866     1  0 -40   - - 42859 ghost_ ?        00:00:00 nscd
1 S   530 12989 12798  0 -40   - -  2721 ghost_ ?        00:00:00 mpiexec
1 R   530 12990 16051 99 -40   - - 42048 ghost_ ?        23:30:28 mb
1 S     0 13332     1  0 -40   - -  3213 ghost_ ?        03:15:48 pbs_mom
1 S     0 13763     1  0 -40   - - 42859 ghost_ ?        00:00:00 nscd
1 S     0 13796     1  0 -40   - -  1720 ghost_ ?        00:00:00 portmap
1 S     0 13822     1  0 -40   - -  1720 ghost_ ?        00:00:00 portmap
1 S     0 13860     1  0 -40   - -  1720 ghost_ ?        00:00:00 portmap
1 S     0 14168     1  0 -40   - - 42859 ghost_ ?        00:00:00 nscd
1 S   517 14289     1  0 -40   - -  7308 ghost_ ?        00:00:00 poy_4.1
1 S   517 14291     1  0 -40   - -  7308 ghost_ ?        00:00:00 poy_4.1
1 S   517 14293     1  0 -40   - -  7308 ghost_ ?        00:00:00 poy_4.1
1 S   517 14295     1  0 -40   - -  7308 ghost_ ?        00:00:00 poy_4.1
1 S   517 14297     1  0 -40   - -  7308 ghost_ ?        00:00:00 poy_4.1
1 S   517 14300     1  0 -40   - -  7308 ghost_ ?        00:00:00 poy_4.1
1 S   517 14302     1  0 -40   - -  7308 ghost_ ?        00:00:00 poy_4.1
1 S   517 14304     1  0 -40   - -  7308 ghost_ ?        00:00:00 poy_4.1
1 S   517 14306     1  0 -40   - -  7308 ghost_ ?        00:00:00 poy_4.1
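
The comparison described above (processes on the system versus users with
scheduled jobs) could be automated roughly as follows. This is only a
minimal sketch in Python, under several assumptions that are not part of
the output above: the listing is in the 14-column `ps -el` style shown,
regular user accounts start at UID 500, and the caller supplies the set of
UIDs that currently own active jobs. The function names are hypothetical
and nothing here is a Moab or TORQUE interface.

```python
def parse_ps_line(line):
    """Parse one `ps -el`-style line into (uid, pid, cmd), or None.

    Assumed column order: F S UID PID PPID C PRI NI ADDR SZ WCHAN TTY TIME CMD.
    Header lines and truncated lines are skipped by returning None.
    """
    fields = line.split()
    if len(fields) < 14 or not fields[2].isdigit() or not fields[3].isdigit():
        return None
    # The command may contain spaces, so rejoin everything after TIME.
    return int(fields[2]), int(fields[3]), " ".join(fields[13:])


def find_unscheduled(ps_lines, active_uids, min_user_uid=500):
    """Return (uid, pid, cmd) for user processes whose owner has no active job.

    active_uids: UIDs of users who currently have jobs in the scheduler
                 (supplied by the caller; how it is built is site-specific).
    min_user_uid: assumed lower bound for regular user accounts, so that
                  root and system daemons (portmap, nscd, pbs_mom) are ignored.
    """
    stray = []
    for line in ps_lines:
        parsed = parse_ps_line(line)
        if parsed is None:
            continue
        uid, pid, cmd = parsed
        if uid >= min_user_uid and uid not in active_uids:
            stray.append((uid, pid, cmd))
    return stray
```

A check like this only reports candidates. It cannot tell a leftover process
from a legitimate one that belongs to a user who also has a job running, so
the result would still need to be reviewed before anything is killed.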

------------------

Is there any way to prevent this spawning? What is the best way, besides
constantly asking users, to clean this up?

I appreciate your help.

GD



