VOV: Find and read journal files

You can read a journal file to trace the sequence of events that led a file or job into a particular state.

The vovserver saves several kinds of files to record its activity. Its main log file is in .swd/logs/server* (list the files by date and choose the most recent). A new version of this log is created each time vovserver starts.
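
The "list by date and choose the most recent" step can be scripted. The Python sketch below is illustrative only: the function name latest_server_log and its swd_dir parameter are hypothetical, and it assumes that the file modification time is an acceptable stand-in for "most recent".

```python
from pathlib import Path


def latest_server_log(swd_dir):
    """Return the newest server* file under <swd_dir>/logs, or None if there is none.

    Assumption: "newest" is judged by file modification time.
    """
    candidates = Path(swd_dir, "logs").glob("server*")
    return max(candidates, key=lambda p: p.stat().st_mtime, default=None)
```

For example, latest_server_log(".swd") would return the path of the newest matching file when run from the project directory.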

The journal files record the micro-events of the vovserver. These events are sent to clients, such as the GUI, monitors, and -wl jobs in NC, so that they can keep their state up to date. The journals are rotated daily and stored under .swd/journals/YYYY.MM.DD.jrn.
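
Because the journals follow the YYYY.MM.DD.jrn naming pattern, the file names for a range of days can be computed rather than typed by hand. The following Python sketch assumes only that pattern; the function name journal_paths and its parameters are hypothetical, and the function does not check whether the files exist.

```python
from datetime import date, timedelta
from pathlib import Path


def journal_paths(swd_dir, first_day, last_day):
    """Return the expected journal paths for each day from first_day to last_day, inclusive."""
    paths = []
    day = first_day
    while day <= last_day:
        # One journal per day, named YYYY.MM.DD.jrn under <swd_dir>/journals.
        paths.append(Path(swd_dir) / "journals" / day.strftime("%Y.%m.%d.jrn"))
        day += timedelta(days=1)
    return paths
```

Calling journal_paths(".swd", date(2024, 1, 30), date(2024, 2, 1)) would yield three paths, one for each day in the range.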

In the current release, the journals are intended for machine consumption and are terse and cryptic. That said, these journals can be used for auditing and troubleshooting. Each event line is identified by a single character (see below).

The journals can be browsed with a CGI script that is provided here:

This script translates the events into human-readable form and lets you follow either all events or only those related to a specific node or job. The events are grouped by time slice.

You can also use grep with the ID of a job to find all the journal events related to that job. For most jobs, all the events will be in a single journal file. For long-running jobs, you may need to examine files from multiple days.
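
The same search can be done across several days at once. The Python sketch below is one possible approach, not part of the product: find_job_events is a hypothetical helper, and it makes no assumption about the journal line layout beyond the job ID appearing somewhere in the line. It matches the ID only when it is not embedded in a longer run of digits, so pass the ID exactly as it is written in the journal.

```python
import re
from pathlib import Path


def find_job_events(journal_dir, job_id):
    """Return (file name, line) pairs for every journal line that mentions job_id.

    Files are visited in name order, which is chronological for YYYY.MM.DD.jrn names.
    """
    # Reject matches where the ID is part of a longer number.
    pattern = re.compile(r"(?<!\d)" + re.escape(str(job_id)) + r"(?!\d)")
    hits = []
    for path in sorted(Path(journal_dir).glob("*.jrn")):
        with open(path, errors="replace") as fh:
            for line in fh:
                if pattern.search(line):
                    hits.append((path.name, line.rstrip("\n")))
    return hits
```

The result lists the matching lines day by day, which is convenient for long-running jobs whose events span more than one journal file.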

The following code excerpts are taken from the CGI script and show how to identify each kind of line in the journal file. Each event consists of a subject, which identifies the object, and a verb, which identifies the action.

Here are the subject types:

set subjectMap(c) "slave"
set subjectMap(d) "newslave"
set subjectMap(f) "file"
set subjectMap(j) "job"
set subjectMap(l) "journal"
set subjectMap(n) "node"
set subjectMap(v) "server"
set subjectMap(r) "retrace"
set subjectMap(s) "set"
set subjectMap(t) "trace"

And here are the verb types; some verbs apply only to specific subjects.

set verbMap(a)   "attach"  ; # to a set, for example
set verbMap(c)   "create"
set verbMap(d)   "detach"
set verbMap(e)   "error"
set verbMap(f)   "forget"
set verbMap(h)   "change"
set verbMap(i)   "inform"
set verbMap(k)   "stuck"
set verbMap(l)   "schedule"
set verbMap(m)   "move"
set verbMap(n)   "conflict"
set verbMap(o)   "overflow"
set verbMap(p)   "stop"
set verbMap(q)   "dispatch"
set verbMap(r)   "retraceinfo"
set verbMap(s)   "save"
set verbMap(t)   "start"
set verbMap(u)   "deschedule"
set verbMap(w)   "stats"
set verbMap(x)   "connect"
set verbMap(y)   "disconnect"
set verbMap(P)   "reserve_stop"
set verbMap(R)   "grab_resources"
set verbMap(T)   "reserve_start"
set verbMap(Z)   "compress"
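
As a rough illustration of how the two maps combine, the Python sketch below mirrors the Tcl tables above and translates one subject code and one verb code into words. The function describe_event is hypothetical and is not taken from the CGI script. It expects the two codes to be passed separately, because where they appear within a journal line is not described here.

```python
# Subject codes, copied from the subjectMap excerpt above.
SUBJECT_MAP = {
    "c": "slave", "d": "newslave", "f": "file", "j": "job", "l": "journal",
    "n": "node", "v": "server", "r": "retrace", "s": "set", "t": "trace",
}

# Verb codes, copied from the verbMap excerpt above. Codes are case-sensitive.
VERB_MAP = {
    "a": "attach", "c": "create", "d": "detach", "e": "error", "f": "forget",
    "h": "change", "i": "inform", "k": "stuck", "l": "schedule", "m": "move",
    "n": "conflict", "o": "overflow", "p": "stop", "q": "dispatch",
    "r": "retraceinfo", "s": "save", "t": "start", "u": "deschedule",
    "w": "stats", "x": "connect", "y": "disconnect", "P": "reserve_stop",
    "R": "grab_resources", "T": "reserve_start", "Z": "compress",
}


def describe_event(subject_code, verb_code):
    """Translate a subject code and a verb code into a readable phrase."""
    subject = SUBJECT_MAP.get(subject_code, "unknown(%s)" % subject_code)
    verb = VERB_MAP.get(verb_code, "unknown(%s)" % verb_code)
    return "%s %s" % (subject, verb)
```

With these tables, describe_event("j", "t") gives "job start" and describe_event("s", "a") gives "set attach". Codes that are not in the tables are reported as unknown rather than raising an error.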
