.\" @(#)sfs.1 2.1 97/10/23
.\" See DESCR.SFS file for restrictions
.\"
.\" create man page by running 'tbl sfs.1 | nroff -man > sfs.cat'
.\"
.TH SFS 1 "5 October 1994"
.SH NAME
SFS \- Network File System server benchmark program
.SH SYNOPSIS
.B sfs
[
.B \-a access_pcnt
] [
.B \-A append_pcnt
]
.br
[
.B \-b blocksz_file
] [
.B \-B block_size
] [
.B \-d debug_level
]
.br
[
.B \-D dir_cnt
] [
.B \-f file_set_delta
] [
.B \-F file_cnt
] [
.B \-i
] [
.B \-l load
]
.br
[
.B \-m mix_file
] [
.B \-M prime_client_hostname
]
.br
[
.B \-N client_num
] [
.B \-p processes
] [
.B \-P
] [
.B \-Q
]
.br
[
.B \-R biod_reads
] [
.B \-S symlink_cnt
] [
.B \-t time
] [
.B \-T op_num
]
.br
[
.B \-V
] [
.B \-W biod_writes
] [
.B \-w warmup_time
] [
.B \-z
]
.br
[
.B server:/directory ...
]
.LP
.B sfs3
[
.B \-a access_pcnt
] [
.B \-A append_pcnt
]
.br
[
.B \-b blocksz_file
] [
.B \-B block_size
] [
.B \-d debug_level
]
.br
[
.B \-D dir_cnt
] [
.B \-f file_set_delta
] [
.B \-F file_cnt
] [
.B \-i
] [
.B \-l load
]
.br
[
.B \-m mix_file
] [
.B \-M prime_client_hostname
]
.br
[
.B \-N client_num
] [
.B \-p processes
] [
.B \-P
] [
.B \-Q
]
.br
[
.B \-R biod_reads
] [
.B \-S symlink_cnt
] [
.B \-t time
] [
.B \-T op_num
]
.br
[
.B \-V
] [
.B \-W biod_writes
] [
.B \-w warmup_time
] [
.B \-z
]
.br
[
.B server:/directory ...
]
.SH DESCRIPTION
Normally,
.B SFS
is executed via the
.B sfs_mgr
script, which controls
.B SFS
execution on one or more
.SM NFS
client systems using a single user interface.
.P
.B SFS
is a Network File System server benchmark.
It runs on one or more
.SM NFS
client machines to generate an artificial load consisting of
a particular mix of
.SM NFS
operations on a particular set of files residing on the server being tested.
The benchmark reports the average response time of the
.SM NFS
requests in milliseconds for the requested target load.
The response time is the dependent variable.
Load can be generated for a specific amount of time,
or for a specific number of
.SM NFS
calls.
.P
.B SFS
can also be used to characterize server performance.
Nearly all of the major factors that influence
.SM NFS
performance can be controlled using
.B SFS
command line arguments.
Normally, however, only the
.B \-l load
option is used.
Other commonly used options include the
.BR "\-t time" ,
.BR "\-p processes" ,
and
.B \-m mix_file
options.
The remaining options are used to adjust specific parameters that affect
.SM NFS
performance.
If these options are used, the results produced will be non\-standard
and thus will not be comparable to tests run with other option settings.
.P
Normally,
.B SFS
is used as a benchmark program to measure the
.SM NFS
performance of a particular server machine at different load levels.
In this case, the preferred measurement is to make a series of benchmark runs,
varying the load factor for each run in order to produce a
performance/load curve.
Each point in the curve represents the server's response time
at a specific load.
.P
.B SFS
generates and transmits
.SM NFS
packets over the network to the server directly from the benchmark program,
without using the client's local
.SM NFS
service.
This reduces the effect of the client
.SM NFS
implementation on results,
and makes comparison of different servers more client-independent.
However, not all client implementation effects have been eliminated.
Since the benchmark bypasses much of the client
.SM NFS
implementation
(including operating system level data caching and write behind),
.B SFS
can only be used to measure server performance.
.P
Although
.B SFS
can be run between a single client and single server pair of machines,
a more accurate measure of server performance is obtained by executing
the benchmark on a number of clients simultaneously.
Not only does this present a more realistic model of
.SM NFS
usage, but it also improves the chances that maximum performance is limited
by a lack of resources on the server machine,
rather than on a single client machine.
.P
To facilitate running
.B SFS
on a number of clients simultaneously, an accompanying program called
.B sfs_mgr
provides a mechanism to run and synchronize the execution of
multiple instances of
.B SFS
spread across multiple clients and multiple networks,
all generating load to the same
.SM NFS
server.
In general,
.B sfs_mgr
should be used to run both single- and multiple-client tests.
.P
.B SFS
employs a number of sub\-processes, each with its own test directory named
.B ./testdir<n>
(where <n> is a number from 0 to
.B processes
\- 1).
A standard set of files is created in the test directory.
.P
If multiple
.B directories
are specified on the command line, the
.B SFS
processes will be evenly distributed among the directories.
This will produce a balanced load across each of the directories.
.P
The mix of
.SM NFS
operations generated can be set from a user defined mix file.
The file format is simple: the first line contains the string
"SFS MIXFILE VERSION 2", and each following line contains
an operation name and its percentage (e.g. "write 12").
The percentages must total 100.
Operations that are not specified will never be called by
.BR SFS .
.SH SFS OPTIONS
.TP
.B \-a access_pcnt
The percentage of I/O files to access.
The access percent can be set from 0 to 100 percent.
The default is 10% access.
.TP
.B \-A append_pcnt
The percentage of write operations that append data to files
rather than overwriting existing data.
The append percent can be set from 0 to 100 percent.
The default is 70% append.
.TP
.B \-b blocksz_file
The name of a file containing a block transfer size distribution specification.
The format of the file and the default values are discussed below
under "Block Size Distribution".
.TP
.B \-B block_size
The maximum number of kilobytes (KB) contained in any one data transfer block.
The valid range of values is from 1 to 8 KB.
The default maximum is 8 KB per transfer.
.TP
.B \-d debug_level
The debugging level, with higher values producing more debugging output.
When the benchmark is executed with debugging enabled,
the results are invalid.
The debug level must be greater than zero.
.TP
.B \-D dir_cnt
The number of files in each directory;
the number of directories varies with the load.
The default is 30 files per directory.
.TP
.B \-f file_set_delta
The percentage change in file set size.
The change percent can be set from 0 to 100 percent.
The default is 10%.
.TP
.B \-F file_cnt
The number of files to be used for read and write operations.
The file count must be greater than 0.
The default is 100 files.
.TP
.B \-i
Run the test in interactive mode,
pausing for user input before starting the test.
The default setting is to run the test automatically.
.TP
.B \-l load
The number of
.SM NFS
calls per second to generate from each client machine.
The load must be greater than the number of processes
(see the "\-p processes" option).
The default is to generate 60 calls/sec on each client.
.TP
.B \-m mix_file
The name of a file containing the operation mix specification.
The format of the file and the default values are discussed below
under "Operation Mix".
.TP
.B \-p processes
The number of processes used to generate load on each client machine.
The number of processes must be greater than 0.
The default is 4 processes per client.
.TP
.B \-P
Populate the test directories and then exit; don't run the test.
This option can be used to examine the file set that the benchmark creates.
The default is to populate the test directories and then
execute the test automatically.
.TP
.B \-Q
Run
.SM NFS
over TCP.
The default is UDP.
.TP
.B \-R biod_reads
The maximum number of asynchronous reads issued to simulate biod behavior.
The default is 2.
.TP
.B \-S symlink_cnt
The number of symbolic links to be used for symlink operations.
The symbolic link count must be greater than 0.
The default is 20 symlinks.
.TP
.B \-t time
The number of seconds to run the benchmark.
The number of seconds must be greater than 0.
The default is 300 seconds.
(Run times less than 300 seconds may produce invalid results.)
.TP
.B \-T op_num
Test a particular
.SM NFS
operation by executing it once.
The valid range of operations is from 1 to 23.
These values correspond to the procedure number for each operation type
as defined in the
.SM NFS
protocol specification.
The default is to run the benchmark, with no preliminary operation testing.
.TP
.B \-V
Validate the correctness of the server's
.SM NFS
implementation.
The option verifies the correctness of
.SM NFS
operations and data copies.
The verification takes place immediately before executing the test,
and does not affect the results reported by the test.
The default is not to verify server
.SM NFS
operation before beginning the test.
.TP
.B \-z
Generate a raw data dump of the individual data points for the test run.
.TP
.B \-w warmup_time
The number of seconds to generate load before starting the timed test run.
The goal is to reach a steady state and eliminate any variable startup costs
before beginning the test.
The warm up time must be greater than or equal to 0 seconds.
The default is a 60 second warmup period.
.TP
.B \-W biod_writes
The maximum number of asynchronous writes issued to simulate biod behavior.
The default is 2.
.SH MULTI-CLIENT OPTIONS
.B SFS
also recognizes options that are only used when executing a multi-client test.
These options are generated by
.B sfs_mgr
and should not be specified by an end-user.
.TP
.B \-M prime_client_hostname
The hostname of the client machine from which a multi-client test
is being controlled.
This machine is designated as the "prime client".
The prime client machine may also be executing the
.B SFS
load-generating code.
There is no default value.
.TP
.B \-N client_num
The client machine's unique identifier within a multi-client test,
assigned by the
.B sfs_mgr
script.
There is no default value.
.\".TP
.\".B \-R random_number_seed
.\"The value used by the client to index into a table of random number seeds.
.\"There is no default value.
.SH OPERATION MIX
The
.B SFS
default mix of operations for version 2 is:
.sp
.TS
center tab(@);
l l l l l l
n n n n n n
l l l l l l
n n n n n n
l l l l l l
n n n n n n.
null@getattr@setattr@root@lookup@readlink
0%@26%@1%@0%@36%@7%
read@wrcache@write@create@remove@rename
14%@7%@1%@1%@0%@0%
link@symlink@mkdir@rmdir@readdir@fsstat
0%@0%@0%@0%@6%@1%
.TE
.LP
The
.B SFS
default mix of operations for version 3 is:
.sp
.TS
center tab(@);
l l l l l l
n n n n n n
l l l l l l
n n n n n n
l l l l l l
n n n n n n
l l l l l l
n n n n n n.
null@getattr@setattr@lookup@access@readlink
0%@11%@1%@27%@7%@7%
read@write@create@mkdir@symlink@mknod
18%@9%@1%@0%@0%@0%
remove@rmdir@rename@link@readdir@readdirplus
1%@0%@0%@0%@2%@9%
fsstat@fsinfo@pathconf@commit
1%@0%@0%@5%
.TE
.P
The mix file format is simple: the first line contains the string
"SFS MIXFILE VERSION 2", and each following line contains
an operation name and its percentage (e.g. "write 12").
The percentages must total 100.
.SH FILE SET
The default basic file set used by
.B SFS
consists of regular files varying in size from 1KB to 1MB
used for read and write operations,
and 20 symbolic links used for symbolic link operations.
In addition to these, a small number of regular files are created
and used for non-I/O operations (e.g., getattr),
and a small number of regular, directory, and symlink files
may be added to this total due to creation operations (e.g., mkdir).
.P
While these values can be controlled with command line options,
some file set configurations may produce invalid results.
If there are not enough files of a particular type,
the specified mix of operations will not be achieved.
Too many files of a particular type may produce thrashing effects
on the server.
.SH BLOCK SIZE DISTRIBUTION
The block transfer size distribution is specified by a table of values.
The first column gives the percentage of operations that will use
a particular block transfer size.
The second column gives the number of block units that will be transferred.
Normally the block unit size is 8KB.
The third column is a boolean specifying whether a trailing fragment block
should be transferred.
The fragment size for each transfer is a random multiple of 1 KB,
up to the block size \- 1 KB.
Two tables are used, one for read operations and one for write operations.
The following tables give the default distributions
for the read and write operations.
.sp
.TS
center tab(@);
c s s s
c s s s
r r r r
r r r r
c s s s
n n n r.
Read - Default Block Transfer Size Distribution Table

@@@resulting transfer
percent@block count@fragment@(8KB block size)

0@0@0@0%    0 -   7 KB
85@1@0@85%    8 -  15 KB
8@2@1@8%   16 -  23 KB
4@4@1@4%   32 -  39 KB
2@8@1@2%   64 -  71 KB
1@16@1@1%  128 - 135 KB
.TE
.sp 2
.TS
center tab(@);
c s s s
c s s s
r r r r
r r r r
c s s s
n n n r.
Write - Default Block Transfer Size Distribution Table

@@@resulting transfer
percent@block count@fragment@(8KB block size)

49@0@1@49%    0 -   7 KB
36@1@1@36%    8 -  15 KB
8@2@1@8%   16 -  23 KB
4@4@1@4%   32 -  39 KB
2@8@1@2%   64 -  71 KB
1@16@1@1%  128 - 135 KB
.TE
.P
A different distribution can be substituted by using the
.B \-b
option.
The format for the block size distribution file consists of
the first three columns given above: percent, block count, and fragment.
Read and write distribution tables are identified by the keywords
"Read" and "Write".
An example input file, using the default values, is given below:
.sp
.TS
tab(@);
l s s
n n n.
Read
0@0@0
85@1@0
8@2@1
4@4@1
2@8@1
1@16@1
.TE
.TS
tab(@);
l s s
n n n.
Write
49@0@1
36@1@1
8@2@1
4@4@1
2@8@1
1@16@1
.TE
.P
A second aspect of the benchmark controlled by the
block transfer size distribution table is the network data packet size.
The distribution tables define the relative proportion of
full block packets to fragment packets.
For instance, the default tables have been constructed to produce
a specific distribution of Ethernet packet sizes for I/O operations
by controlling the amount of data in each packet.
The write packets produced consist of 50% 8-KB packets, and 50% 1-7 KB packets.
The read packets consist of 90% 8-KB packets, and 10% 1-7 KB packets.
These figures are determined by multiplying the percentage of each
type of transfer by the number of blocks and fragments generated,
and adding the totals.
These computations are performed below for the default
block size distribution tables:
.sp
.TS
tab(@);
c c c c c
c c c c c
n n n n n
n n n n n
n n n n n
n n n n n
n n n n n
n n n n n
r r r l l
r r r n n
r r r l l.
Read@@@total@total
percent@blocks@fragments@blocks@fragments
0@0@0@0@0
85@1@0@85@0
8@2@1@16@8
4@4@1@16@4
2@8@1@16@2
1@16@1@16@1
@@@----@-----
@@@149@15
@@@90%@10%
.TE
.sp 3
.TS
tab(@);
r r r r r
r r r r r
n n n n n
n n n n n
n n n n n
n n n n n
n n n n n
n n n n n
r r r l l
r r r n n
r r r l l.
Write@@@total@total
percent@blocks@fragments@blocks@fragments
49@0@1@0@49
36@1@1@36@36
8@2@1@16@8
4@4@1@16@4
2@8@1@16@2
1@16@1@16@1
@@@----@------
@@@100@100
@@@50%@50%
.TE
.SH USING SFS
As with all benchmarks,
.B SFS
can only provide numbers that are useful if the test runs are set up carefully.
Since it measures server performance,
the client (or clients) should not limit throughput.
The goal is to determine how well the server performs.
Most tests involving a single client will be limited by the client's
ability to generate load, not by the server's ability to handle more load.
Whether this is the case can be determined by running the benchmark
at successively greater load levels and finding the "knee of the curve",
the load level at which the response time begins to increase rapidly.
Having found the knee of the curve, measurements of CPU utilization,
disk I/O rates, and network utilization levels should be made
in order to determine whether the performance bottleneck is due to
the client or server.
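.P
The search for the "knee of the curve" described above can be sketched
in Python.
This is only an illustration and is not part of
.BR SFS :
the function name
.BR find_knee ,
the slope-ratio rule, and the default thresholds are all assumptions,
since this page does not define the knee numerically.
The input is a list of (load, response time) pairs collected from
a series of runs, sorted by strictly increasing load.
.nf
```python
def find_knee(points, factor=2.0, min_slope=0.001):
    """Return the load at which response time starts rising rapidly.

    points: list of (load_calls_per_sec, response_ms) pairs, sorted by
    strictly increasing load.  The rule used here is an assumption:
    the knee is the start of the first segment whose slope exceeds
    `factor` times the slope of the first segment.  `min_slope` is a
    floor on that baseline so a flat first segment does not make every
    later rise look like a knee.  Returns None if no knee is found.
    """
    if len(points) < 3:
        return None
    (load0, resp0), (load1, resp1) = points[0], points[1]
    baseline = max((resp1 - resp0) / (load1 - load0), min_slope)
    # Walk the remaining segments in order of increasing load.
    for (load_a, resp_a), (load_b, resp_b) in zip(points[1:], points[2:]):
        slope = (resp_b - resp_a) / (load_b - load_a)
        if slope > factor * baseline:
            return load_a
    return None
```
.fi
.P
A result of None suggests that the tested loads did not reach the knee,
so higher loads may need to be tried.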
.P
For the results reported by
.B SFS
to be meaningful, the tests should be run on an isolated network,
and both the client and server should be as quiescent as possible during tests.
.P
High error rates on either the client or server can also cause delays
due to retransmissions of lost or damaged packets.
.BR netstat (8)
can be used to measure the network error and collision rates
on the client and server.
.B SFS
also reports the number of timed-out
.SM RPC
calls that occur during the test as bad calls.
If the number of bad calls is too great,
or the specified mix of operations is not achieved,
.B SFS
reports that the test run is "Invalid".
In this case, the reported results should be examined
to determine the cause of the errors.
.P
To best simulate the effects of
.SM NFS
clients on the server, the test directories should be set up so that they are
on at least two disk partitions exported by the server.
.SM NFS
operations tend to randomize disk access, so putting all of the
.B SFS
test directories on a single partition will not show realistic results.
.P
On all tests it is a good idea to run the tests repeatedly and compare results.
If the difference between runs is large,
the run time of the test should be increased until the variance
in milliseconds per call is acceptably small.
If increasing the length of time does not help,
there may be something wrong with the experimental setup.
.P
The numbers generated by
.B SFS
are only useful for comparison if the test setup on the client machine
is the same across different server configurations.
Changing the
.B processes
or
.B mix
parameters will produce numbers that cannot be meaningfully compared.
Changing the number of generator processes may affect
the measured response time due to context switching
or other delays on the client machine, while changing the mix of
.SM NFS
operations will change the whole nature of the experiment.
Other changes to the client configuration may also affect
the comparability of results.
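.P
The advice given earlier in this section, to repeat runs and compare
the spread in milliseconds per call, can be sketched in Python.
This is an illustrative sketch only: the function names and the
5% coefficient-of-variation threshold are assumptions, since this page
only asks for a variance that is "acceptably small".
.nf
```python
import statistics

def run_spread(ms_per_call):
    """Return (mean, coefficient of variation) for repeated runs.

    ms_per_call: average response times in milliseconds, one per
    repeated run at the same load.  At least two runs are needed.
    """
    mean = statistics.mean(ms_per_call)
    stdev = statistics.stdev(ms_per_call)  # sample standard deviation
    return mean, stdev / mean

def runs_agree(ms_per_call, max_cv=0.05):
    """True if the run-to-run spread is within max_cv (assumed 5%)."""
    _, cv = run_spread(ms_per_call)
    return cv <= max_cv
```
.fi
.P
If the check fails, the guidance above applies: lengthen the run time,
and look at the experimental setup if longer runs do not help.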
.P
To do a comparison of different server configurations,
first set up the client test directory and do
.B SFS
runs at different loads to be sure that the variability is reasonably low.
Second, run
.B SFS
at different loads of interest and save the results.
Third, change the server configuration
(for example, add more memory, replace a disk controller, etc.).
Finally, run the same
.B SFS
loads again and compare the results.
.SH SEE ALSO
.P
The benchmark
.B README
file contains many pointers to other files
which provide information concerning SFS.
.SH ERROR MESSAGES
.TP 10
.B "illegal load value"
The
.B load
argument following the
.B \-l
flag on the command line is not a positive number.
.TP
.B "illegal procs value"
The
.B processes
argument following the
.B \-p
flag on the command line is not a positive number.
.TP
.B "illegal time value"
The
.B time
argument following the
.B \-t
flag on the command line is not a positive number.
.TP
.B "bad mix file"
The
.B mix file
argument following the
.B \-m
flag on the command line could not be accessed.
.TP
.B "can't fork"
The parent couldn't fork the child processes.
This usually results from lack of resources, such as memory or swap space.
.TP
.PD 0
.B "can't open log file"
.TP
.B "can't stat log"
.TP
.B "can't truncate log"
.TP
.B "can't write sync file"
.TP
.B "can't write log"
.TP
.B "can't read log"
.PD
A problem occurred during the creation, truncation, reading or writing
of the synchronization log file.
The parent process creates the log file in /tmp and uses it
to synchronize and communicate with its children.
.TP
.PD 0
.B "can't open test directory"
.TP
.B "can't create test directory"
.TP
.B "can't cd to test directory"
.TP
.B "wrong permissions on test dir"
.TP
.B "can't stat testfile"
.TP
.B "wrong permissions on testfile"
.TP
.B "can't create rename file"
.TP
.B "can't create subdir"
.PD
A child process had problems creating or checking the contents
of its test directory.
This is usually due to a permission problem
(for example, the test directory was created by a different user)
or a full file system.
.TP
.PD 0
.B "op failed: "
One of the internal pseudo\-NFS operations failed.
The name of the operation, e.g. read, write, lookup,
will be printed along with an indication of the nature of the failure.
.TP
.B "select failed"
The select system call returned an unexpected error.
.SH BUGS
.P
.B SFS
cannot be run on non\-NFS file systems.
.P
Shell scripts that execute
.B SFS
must catch and ignore SIGUSR1, SIGUSR2, and SIGALRM (see
.BR signal (3)).
These signals are used to synchronize the test processes.
If one of these signals is not caught,
the shell that is running the script will be killed.
.SH FILES
.PD 0
.TP
.B ./testdir*
per process test directory
.TP
.B /tmp/sfs_log%d
child process synchronization file
.TP
.B /tmp/sfs_CL%d
client log file
.TP
.B /tmp/sfs_PC_sync
prime client log file
.TP
.B /tmp/sfs_res
prime results log file
.PD
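.SH EXAMPLES
.P
The packet-share arithmetic from the BLOCK SIZE DISTRIBUTION section
can be reproduced with a short Python sketch.
It is illustrative only and is not part of
.BR SFS ;
the names
.B packet_totals
and
.B packet_shares
are assumptions.
The table rows are the default read and write distributions listed on this page
(percent, block count, fragment).
.nf
```python
# Each row: (percent of operations, full blocks per transfer,
#            1 if a trailing fragment is sent, else 0)
READ_TABLE = [(0, 0, 0), (85, 1, 0), (8, 2, 1),
              (4, 4, 1), (2, 8, 1), (1, 16, 1)]
WRITE_TABLE = [(49, 0, 1), (36, 1, 1), (8, 2, 1),
               (4, 4, 1), (2, 8, 1), (1, 16, 1)]

def packet_totals(table):
    """Return (full-block packets, fragment packets) per 100 operations."""
    blocks = sum(percent * count for percent, count, _ in table)
    fragments = sum(percent * frag for percent, _, frag in table)
    return blocks, fragments

def packet_shares(table):
    """Return the percentage of full-block and fragment packets."""
    blocks, fragments = packet_totals(table)
    total = blocks + fragments
    return 100.0 * blocks / total, 100.0 * fragments / total
```
.fi
.P
For the default tables this gives totals of 149 blocks and 15 fragments
for reads, and 100 blocks and 100 fragments for writes,
matching the computation tables in that section.
The unrounded read shares are about 90.9% and 9.1%,
which that section quotes as 90% and 10%.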