# Copyright (c) 2002-2003
# The President and Fellows of Harvard College.
#
# $Id: EXAMPLES,v 1.3 2003/07/28 14:27:16 ellard Exp $
#
# For nfsscan version 0.10a (dated 7/25/2003).
The usual procedure for analyzing a trace is the following:
1. Use nfsscan to produce a tabularized summary of each
   300-second segment of the trace. For these examples,
   we'll call this DEFAULT_TABLE.

   Depending on what you're looking for, the default
   settings of nfsscan might not provide all the info
   you're going to want in the next step. The default is
   to omit per-client, per-user, per-group, and per-file
   stats and only tally total operation counts. [THE
   DEFAULTS MAY CHANGE IN A FUTURE RELEASE.]
2. Use ns_timeagg to create a summary of activity in the
   entire trace, named SUMMARY_TABLE, from DEFAULT_TABLE.

   Note that almost anything ns_timeagg and ns_split can
   do can also be done directly with nfsscan. However,
   the implicit goal of ns_timeagg and ns_split is to
   AVOID re-running nfsscan. It is much faster to
   re-process a table created by nfsscan than it is to
   re-create the table -- the input to nfsscan is
   typically several million (or billion) lines of trace
   data, while the output is usually only a few thousand
   lines.
3. Use ns_quickview to plot interesting aspects of the
   DEFAULT_TABLE and/or SUMMARY_TABLE.
4. [optional] Use ns_split and/or ns_tsplit to isolate
   interesting parts of DEFAULT_TABLE (such as per-client
   or per-user counts). Repeat steps 2 and 3 with the
   resulting tables.

5. [optional] If steps 2-4 found anything interesting, re-run
   nfsscan with new parameters to take a closer look at
   whatever you found.
Examples and discussion of these steps and related topics are
given below.
For these examples, TRACE is a trace file gathered by nfsdump (or
another tool that creates trace files in the same format), and TABLE.ns
is a file created by nfsscan from TRACE. The suffix ".ns" is also
used to denote files that contain tables created by nfsscan,
ns_timeagg, ns_split, and ns_tsplit. Example commandlines always
begin with a '%' prompt.
2. CREATING A SUMMARY TABLE
To compute a table consisting of a single row with counts for
each operation tallied by the nfsscan run, aggregate over time
with a time length of zero. (Zero is treated as a special
time length that includes the entire input table.)

    % ns_timeagg -t0 TABLE.ns > SUMMARY.ns

Note that ns_timeagg will always aggregate over every
attribute (except time), so it does not matter whether or not
TABLE.ns contains per-client, per-user, per-group, or per-file
data. The sum will always be the same.
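The "sum will always be the same" behavior can be sanity-checked
with a few lines of awk on a hand-made table. The column position
used below (TOTAL in field 7) and the row format are illustrative
assumptions, not guarantees about nfsscan's actual layout:

```shell
# Sum the TOTAL column (assumed here to be field 7) over three
# mock per-client rows. Aggregating over the client attribute
# merely collapses these rows into one row carrying this same sum.
printf '%s\n' \
  '0 C client1 x x x 120' \
  '0 C client2 x x x 450' \
  '0 C client3 x x x 300' \
  | awk '{ sum += $7 } END { print sum }'
```

The same total (870 for this mock data) comes out no matter how
finely the rows were broken down.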
On the other hand, if you want to prevent ns_timeagg from
aggregating over a particular attribute, specify that
attribute in the same manner as with nfsscan. For example, to
create a table with a single row for each user, containing the
operation counts for that user:

    % ns_timeagg -t0 -BU TABLE.ns > SUMMARY.ns
Of course, ns_timeagg cannot create data out of thin air. If
TABLE.ns does not contain per-user information, then -BU will
have no effect.
3. PLOTTING THE DATA
To simply plot the total operation count:

    % ns_quickview TABLE.ns
WHICH CLIENT REQUESTS THE MOST OPERATIONS?

Method: use nfsscan to tally the per-client operation counts for the
entire trace file (by using -t0), and then sort by the TOTAL
column.

If TABLE contains per-client information, then this is easy:

    % ns_timeagg -t0 -BC TABLE | grep -v '^#' \
        | awk '{print $7, $3}' | sort -nr
If TABLE does not contain per-client info, then it's necessary
to re-run nfsscan on the original trace:

    % nfsscan -t0 -BC TRACE | grep -v '^#' \
        | awk '{print $7, $3}' | sort -nr
The output from either command is a two-column table. The
first column is the total operation count of each client, and
the second column is the ID of each client.
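The tail of this pipeline can be tried on a tiny hand-made table
to see the shape of the output. The row format and field positions
below (client ID in field 3, TOTAL in field 7) are made up for
illustration; check the column titles in your own nfsscan output
for the real layout:

```shell
# Three mock per-client rows behind a '#' header line.
# grep strips the header, awk reorders the columns, and
# sort -nr ranks the clients by operation count, busiest first.
printf '%s\n' \
  '# mock nfsscan header' \
  '0 C client1 x x x 120' \
  '0 C client2 x x x 450' \
  '0 C client3 x x x 300' \
  | grep -v '^#' | awk '{print $7, $3}' | sort -nr
```

For this mock data the first output line is "450 client2", the
busiest client.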
WHICH CLIENT DOES THE MOST READING?

If we've already got TABLE, and it contains per-client info,
then the easiest way is simply to extract the read count
column (instead of the TOTAL column) from TABLE:

    % ns_timeagg -t0 -BC TABLE | grep -v '^#' \
        | awk '{print $9, $3}' | sort -nr
Or, we can re-run nfsscan. Because we're not interested in
anything except the read count, we can change the list of
operations that nfsscan tabulates so that it only counts
reads. (Of course, the resulting table is useless for anything
except answering this particular question, and since nfsscan
is expensive to run this is probably wasteful.)

    % nfsscan -t0 -BC -Oread -i TRACE | grep -v '^#' \
        | awk '{print $7, $3}' | sort -nr
WHICH CLIENT DOES THE MOST FSSTATS?

fsstat is not ordinarily tabulated by nfsscan. To tell nfsscan
to keep track of it, we can change the list of operations to
consist of only fsstat:

    % nfsscan -t0 -BC -Ofsstat -i TRACE | ...
As mentioned in the previous example, it is often wasteful to
run nfsscan just to get one number. Another approach is to
add fsstat to the default list of "interesting" operations, by
using "+" at the start of the operation list. This tells
nfsscan to append the given list of operations to the default
list:

    % nfsscan -t0 -BC -O+fsstat -i TRACE | ...

An implication of this is that it's impossible to know what
each column in the table represents unless you know what
operations were considered "interesting" for each run of
nfsscan. To help with this, nfsscan includes the commandline
and column titles at the start of each file it creates.
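Since the column layout depends on the operation list, one way to
stay honest is to look a column up by name in the title line rather
than hard-coding a field number. A sketch with awk, using an
invented title-line format (real nfsscan headers may differ, so
adapt the header match to your own output):

```shell
# Find the position of a named column ("fsstat" here) in the title
# line, then print that column plus the client ID for each data row.
# Writing the title as '#time ...' keeps the header fields aligned
# with the data fields; this layout is an assumption.
awk -v want=fsstat '
  /^#/ { for (i = 1; i <= NF; i++) if ($i == want) col = i; next }
  col  { print $col, $3 }
' <<'EOF'
#time tag client total read fsstat
0 C client1 120 80 7
0 C client2 450 300 2
EOF
```

For this mock input, the script prints "7 client1" and "2 client2",
the fsstat counts paired with their clients.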
WHICH USER DOES THE MOST READING?

This is exactly like the previous example, except that we use
-BU instead of -BC, to do everything per-user instead of
per-client.
WHAT DIRECTORIES ARE HOTTEST?

Use the -d option to find the cumulative number of operations
per directory, then sort the results by operation count. In
order to avoid drowning in data, you might choose to print
only the top 100:

    % nfsscan -i TRACE -t0 -d | grep '^D' \
        | awk '{print $7, $5}' | sort -nr | head -100
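As with the per-client examples, the tail of this pipeline can be
exercised on a mock table. The 'D' row format and field positions
(path in field 5, count in field 7) are assumptions made for
illustration:

```shell
# Mock per-directory rows tagged 'D', plus one non-directory row.
# grep '^D' keeps only the directory rows; sort -nr | head ranks
# the busiest directories first.
printf '%s\n' \
  '# mock header' \
  'D 0 x x /home/alice x 900' \
  'D 0 x x /home/bob x 300' \
  'C 0 x x not-a-dir-row x 999' \
  | grep '^D' | awk '{print $7, $5}' | sort -nr | head -100
```

The 'C' row is filtered out, and the first line printed for this
mock data is "900 /home/alice".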