| 1 | + The File System Construction Kit |
| 2 | + version 0.4 |
| 3 | + 12/3/98 |
| 4 | + |
| 5 | + Dominic Giampaolo |
| 6 | + dbg@be.com |
| 7 | + |
| 8 | + |
| 9 | +INTRODUCTION: |
| 10 | + |
| 11 | +Welcome to the File System Construction Kit! This is a software |
| 12 | +package that accompanies the book, Practical File System Design, which |
| 13 | +I wrote and which is published by Morgan Kaufmann (ISBN 1558604979). |
| 14 | + |
| 15 | +This package is a very simple framework in which you can experiment |
| 16 | +with a working (but simple) file system implementation. The framework |
| 17 | +is designed so that you can go in and modify one part of it, such as |
| 18 | +how the used and free disk blocks are managed, and not have to touch |
| 19 | +the rest of the file system. And because the package creates its file |
| 20 | +system inside of a normal file on your hard disk, you don't have to |
| 21 | +have a spare disk or require special (root) privileges to run the |
| 22 | +program. The goal is that this package should provide a convenient |
| 23 | +test bed for trying out new file system ideas without having to go |
| 24 | +through the pain and difficulty of creating a real kernel based file |
| 25 | +system. The API is generic enough, however, that after you debug |
| 26 | +your implementation within this framework it could be moved to a real |
| 27 | +kernel based file system for the BeOS or a Unix like operating system. |
| 28 | + |
| 29 | +This package contains several parts. There is the core file system |
| 30 | +implementation, a kernel-like interface (complete with a full vnode |
| 31 | +layer, etc) and several test programs that use the kernel-like api to |
| 32 | +manipulate the file system. The three programs included are: "makefs" |
| 33 | +which can create a file system, "tstfs" which is a simple stress test |
| 34 | +that creates, writes to, and deletes files, and "fsh", a file system |
| 35 | +shell that lets you interactively manipulate your file system. |
| 36 | + |
| 37 | + |
| 38 | +BUILDING IT: |
| 39 | + |
| 40 | +The package should compile and build right out of the box on most |
| 41 | +versions of Unix and the BeOS. It has been tested on the following |
| 42 | +systems: |
| 43 | + |
| 44 | + BeOS/PPC Release 4 |
| 45 | + BeOS/Intel Release 4 |
| 46 | + Solaris (sparc) 5.5.1 |
| 47 | + Solaris (x86) 2.6 |
| 48 | + FreeBSD (x86) 2.2.2 |
| 49 | + Linux (x86) 2.1.57 |
| 50 | + Irix 6.5 |
| 51 | + |
| 52 | +so it should be reasonably portable. To build it just type "make". |
| 53 | +The result of the build should be three programs: makefs, tstfs and fsh. |
| 54 | +If your compiler doesn't like the flag "-O7" which is in the |
| 55 | +makefile, just change that to be -O3 or whatever you want (Irix users |
| 56 | +will have to change this). |
| 57 | + |
| 58 | + |
| 59 | +USING IT: |
| 60 | + |
| 61 | +To use the package you have to first create a file in which you will |
| 62 | +create a file system. By default the tools expect a file named |
| 63 | +"big_file". On Unix you can create an empty file by doing this: |
| 64 | + |
| 65 | + dd if=/dev/zero of=big_file bs=524288 count=32 |
| 66 | + |
| 67 | +That will create a 16 megabyte file called "big_file" in the current |
| 68 | +directory. |
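
If "dd" is not available, the same zero-filled file can be produced by a
few lines of C. The kit ships mkfile.c for this purpose; the sketch below
is NOT that file, only a minimal illustration. The function name
create_zero_file and the CHUNK size are assumptions made for this example.

```c
#include <assert.h>
#include <stdio.h>

#define CHUNK 65536   /* write granularity; an arbitrary choice */

/* Hypothetical helper (not part of the kit): create `path` and fill it
   with `total_bytes` zero bytes. Returns 0 on success, -1 on error. */
int create_zero_file(const char *path, unsigned long total_bytes)
{
    static char zeros[CHUNK];   /* static storage is zero-initialized */
    unsigned long left = total_bytes;
    FILE *fp;

    assert(path != NULL);

    fp = fopen(path, "wb");
    if (fp == NULL)
        return -1;

    while (left > 0) {
        size_t n = left < CHUNK ? (size_t)left : CHUNK;

        if (fwrite(zeros, 1, n, fp) != n) {
            fclose(fp);
            return -1;
        }
        left -= n;
    }

    return fclose(fp) == 0 ? 0 : -1;
}
```

Calling create_zero_file("big_file", 32UL * 524288UL) would produce a file
of the same size as the dd command above (32 blocks of 524288 bytes, 16
megabytes).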
| 69 | + |
| 70 | +All the tools treat the file "big_file" like a raw disk device. |
| 71 | + |
| 72 | +After you have created "big_file" you need to initialize a file system |
| 73 | +in it. The tool "makefs" does this. If you just run makefs it will |
| 74 | +go ahead and initialize the file big_file with the sample file |
| 75 | +system. |
| 76 | + |
| 77 | +After initializing the file system, you can test it out with "fsh", |
| 78 | +the file system shell. Just run fsh and it will give you a prompt: |
| 79 | + |
| 80 | + fsh>> |
| 81 | + |
| 82 | +You can type "help" to find out what commands are available. Here is |
| 83 | +a summary of a few of the more useful commands: |
| 84 | + |
| 85 | + dir - get an "ls -l" style directory listing |
| 86 | + make - create a file which you can then read and write to with "rd" |
| 87 | + and "wr" |
| 88 | + rd - read some data from the file. you can optionally specify |
| 89 | + how many bytes to read (default is 256). |
| 90 | + wr - write some data to a file. you can optionally specify |
| 91 | + how many bytes to write (default is 256). the data written |
| 92 | + is generated automatically. |
| 93 | + close - close the currently open file |
| 94 | + open - open the named file |
| 95 | + lat_fs - perform a test similar to the LmBench test "lat_fs" (ie. |
| 96 | + create and delete files of various sizes). |
| 97 | + create - create the number of files specified (default 100) |
| 98 | + rmall - remove all the files in a directory. optionally you can |
| 99 | + name a directory and it will only remove the files in that |
| 100 | + directory. |
| 101 | + cp - copy data to or from the file system and the host file |
| 102 | + system. the syntax is as follows: |
| 103 | + |
| 104 | + Copy data from the host file system into myfs: |
| 105 | + |
| 106 | + cp :host-file-name myfs-file-name |
| 107 | + |
| 108 | + Copy data from myfs to the host file system: |
| 109 | + |
| 110 | + cp myfs-file-name :host-file-name |
| 111 | + |
| 112 | + The leading ":" indicates which file is the host file. |
| 113 | + NOTE: copying around inside of myfs is not supported. |
| 114 | + |
| 115 | + |
| 116 | +The last program, tstfs, is a simple stress test. Basically you just |
| 117 | +run it and it will go off and randomly create and delete files in the |
| 118 | +file system. After it is done there will be a bunch of files left |
| 119 | +over. You can then run fsh to look at what it created. |
| 120 | + |
| 121 | + |
| 122 | + |
| 123 | +A TOUR OF THE SOURCES: |
| 124 | + |
| 125 | +The API is not quite as it was described in the appendix of the book |
| 126 | +Practical File System Design. That's because I hadn't written the kit |
| 127 | +until after the book was published. It still follows the appendix |
| 128 | +pretty closely however. |
| 129 | + |
| 130 | +Here are the sources that you'll find. |
| 131 | + |
| 132 | +The test programs: |
| 133 | +----------------- |
| 134 | + makefs.c |
| 135 | + fsh.c |
| 136 | + tstfs.c |
| 137 | + |
| 138 | +The core file system code for "myfs": |
| 139 | +------------------------------------- |
| 140 | + bitmap.c |
| 141 | + bitvector.c |
| 142 | + dir.c |
| 143 | + dstream.c |
| 144 | + file.c |
| 145 | + inode.c |
| 146 | + io.c |
| 147 | + journal.c |
| 148 | + util.c |
| 149 | + mount.c |
| 150 | + |
| 151 | +The supporting infra-structure (vnode layer, disk cache, etc): |
| 152 | +-------------------------------------------------------------- |
| 153 | + cache.c |
| 154 | + initfs.c |
| 155 | + kernel.c |
| 156 | + rootfs.c |
| 157 | + |
| 158 | + |
| 159 | +Miscellaneous support routines and porting bits: |
| 160 | +------------------------------------------------ |
| 161 | + argv.c |
| 162 | + hexdump.c |
| 163 | + sl.c |
| 164 | + stub.c |
| 165 | + sysdep.c |
| 166 | + |
| 167 | +If you don't have "dd" for some reason: |
| 168 | +--------------------------------------- |
| 169 | + mkfile.c |
| 170 | + |
| 171 | + |
| 172 | + |
| 173 | +MODIFYING IT: |
| 174 | + |
| 175 | +Each of the major components of the file system (block management, |
| 176 | +inode management, data stream reading and writing, directory |
| 177 | +management, file management, etc) is broken out into individual |
| 178 | +source files. As long as you maintain the api defined by the |
| 179 | +corresponding header file, the rest of the file system should continue |
| 180 | +to work. So for example if you wanted to change how directories store |
| 181 | +their contents, you could go modify dir.c, change it as you see fit, |
| 182 | +recompile and the rest of the file system will continue to work. |
| 183 | + |
| 184 | +The master header file for file system data structures is myfs.h. |
| 185 | +Basically any data structure that you want to modify is in there. |
| 186 | + |
| 187 | + |
| 188 | +NOTES ABOUT THE IMPLEMENTATION: |
| 189 | + |
| 190 | +This is a very simple file system. It is definitely not fast and |
| 191 | +isn't intended to be a commercial quality file system. It is intended |
| 192 | +to be easy to go in and modify. As an example of how simple it is, |
| 193 | +every time a directory is modified, its entire contents are read into |
| 194 | +memory, the changes made and the entire contents written back out. I |
| 195 | +chose to do it this way so it would be easier to understand and |
| 196 | +modify. Clearly I didn't do it to be fast. |
| 197 | + |
| 198 | +The file system has a single super block, a simple used/free block |
| 199 | +bitmap, an inode bitmap, and an inode table. The rest of the disk is |
| 200 | +for storing user data. |
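
As an illustration of what a "simple used/free bitmap" involves, here is a
minimal sketch of allocating and freeing entries in a bit array. It is not
the code in bitmap.c or bitvector.c; the names bitmap_alloc and bitmap_free,
the first-fit search, and the bit ordering within a byte are all assumptions
made for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: find the first clear bit among the first `nbits`
   bits of `map`, mark it used, and return its index. Returns -1 if every
   bit is already set. */
long bitmap_alloc(unsigned char *map, size_t nbits)
{
    size_t i;

    assert(map != NULL);

    for (i = 0; i < nbits; i++) {
        unsigned char mask = (unsigned char)(1u << (i % 8));

        if ((map[i / 8] & mask) == 0) {
            map[i / 8] |= mask;      /* mark as used */
            return (long)i;
        }
    }
    return -1;                       /* no free entry */
}

/* Hypothetical helper: mark entry `bit` as free again. */
void bitmap_free(unsigned char *map, size_t bit)
{
    assert(map != NULL);
    map[bit / 8] &= (unsigned char)~(1u << (bit % 8));
}
```

The same pair of routines would serve for both the block bitmap and the
inode bitmap, since each only tracks whether a numbered slot is in use. A
linear scan like this is slow on large maps, which is consistent with the
kit favoring simplicity over speed.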
| 201 | + |
| 202 | +The layout of the file system is as follows: |
| 203 | + |
| 204 | + +--------------------------------------------------------------- |
| 205 | + | super | block | inode | inode | user |
| 206 | + | block | bitmap | bitmap | table | data..... |
| 207 | + +--------------------------------------------------------------- |
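
To make the layout concrete, the sketch below computes where each region
would start, in units of blocks. It is only an illustration under stated
assumptions: that the super block occupies block 0, that each bitmap uses
one bit per item, and that regions are rounded up to whole blocks. The
struct, the function name compute_layout, and all parameters are
hypothetical; the real values live in myfs.h.

```c
#include <assert.h>

/* Hypothetical description of where each on-disk region begins. */
struct layout {
    unsigned long block_bitmap_start;
    unsigned long inode_bitmap_start;
    unsigned long inode_table_start;
    unsigned long data_start;
};

/* Integer division that rounds up. */
static unsigned long div_round_up(unsigned long a, unsigned long b)
{
    return (a + b - 1) / b;
}

/* Compute region start blocks, assuming the super block is block 0. */
struct layout compute_layout(unsigned long block_size,
                             unsigned long num_blocks,
                             unsigned long num_inodes,
                             unsigned long inode_size)
{
    struct layout l;
    unsigned long bits_per_block;

    assert(block_size > 0 && inode_size > 0);
    bits_per_block = block_size * 8;

    l.block_bitmap_start = 1;
    l.inode_bitmap_start = l.block_bitmap_start
                         + div_round_up(num_blocks, bits_per_block);
    l.inode_table_start  = l.inode_bitmap_start
                         + div_round_up(num_inodes, bits_per_block);
    l.data_start         = l.inode_table_start
                         + div_round_up(num_inodes * inode_size, block_size);
    return l;
}
```

For example, with 1024-byte blocks, 16384 blocks (16 megabytes), 1024
inodes and 128-byte inodes (all made-up numbers), the block bitmap takes 2
blocks, the inode bitmap 1 block and the inode table 128 blocks, so user
data would begin at block 132.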
| 208 | + |
| 209 | +Files store their data using a simple block list of direct, indirect |
| 210 | +and double indirect blocks. The data stream code (currently) does not |
| 211 | +support growing into the double-indirect blocks of a file (it's still |
| 212 | +on the to-do list). |
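
The sketch below shows how a logical block number within a file can be
mapped onto a direct, indirect or double-indirect slot. It is a generic
illustration of the scheme, not the code in dstream.c (which, as noted
above, does not yet grow into double-indirect blocks). The names, the
number of direct slots and the number of pointers per block are assumptions
passed in as parameters.

```c
#include <assert.h>

enum level {
    LEVEL_DIRECT,
    LEVEL_INDIRECT,
    LEVEL_DOUBLE_INDIRECT,
    LEVEL_TOO_BIG
};

/* Where a file block lives: which level, and the slot index at each step.
   idx[0] is the slot in the direct array, the indirect block, or the
   double-indirect block; idx[1] is used only for double-indirect. */
struct block_path {
    enum level    level;
    unsigned long idx[2];
};

/* Hypothetical helper: locate logical file block `n`. */
struct block_path locate_block(unsigned long n,
                               unsigned long num_direct,
                               unsigned long ptrs_per_block)
{
    struct block_path p = { LEVEL_TOO_BIG, { 0, 0 } };

    assert(ptrs_per_block > 0);

    if (n < num_direct) {
        p.level = LEVEL_DIRECT;
        p.idx[0] = n;
        return p;
    }
    n -= num_direct;

    if (n < ptrs_per_block) {
        p.level = LEVEL_INDIRECT;
        p.idx[0] = n;
        return p;
    }
    n -= ptrs_per_block;

    if (n < ptrs_per_block * ptrs_per_block) {
        p.level = LEVEL_DOUBLE_INDIRECT;
        p.idx[0] = n / ptrs_per_block;   /* which indirect block */
        p.idx[1] = n % ptrs_per_block;   /* slot inside that block */
        return p;
    }

    return p;   /* beyond what this scheme can address */
}
```

With 12 direct slots and 256 pointers per block (illustrative values only),
blocks 0 to 11 are direct, blocks 12 to 267 go through the indirect block,
and blocks from 268 onward go through the double-indirect block.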
| 213 | + |
| 214 | +There is no journaling support currently. I will probably add this |
| 215 | +sometime later. It takes a bit of work and I wanted to get the |
| 216 | +package out as opposed to having it sit on my hard disk for another |
| 217 | +month or so. Currently there is no safe ordering for disk writes |
| 218 | +since when I implement journaling that won't matter anyway. |
| 219 | + |
| 220 | +There is no real locking done although that's not such an issue since |
| 221 | +it really isn't intended to be run in a multi-threaded environment. |
| 222 | +The vnode layer does implement correct locking although on a Unix |
| 223 | +system unless you fix stub.c, the locks don't really do much (except |
| 224 | +tell you if you try to lock an already locked lock which should never |
| 225 | +happen on a single threaded system). The vnode layer is actually part |
| 226 | +of an early version of the BeOS vnode layer written by Cyril |
| 227 | +Meurillon. The real BeOS vnode layer is much larger and more complex |
| 228 | +of course. |
| 229 | + |
| 230 | +The cache code is the real Release 4 BeOS disk cache code. The cache |
| 231 | +code can support a real BFS style journal implementation so that |
| 232 | +should ease adding journaling support to myfs. |
| 233 | + |
| 234 | +Oh, I also have not implemented the myfs_rename function yet. That |
| 235 | +will be coming shortly. |
| 236 | + |
| 237 | + |
| 238 | +REPORTING BUGS: |
| 239 | + |
| 240 | +If you fix a bug in the package, I'd like to hear about it. I can't |
| 241 | +really help debug everyone's file system but if you discover a problem |
| 242 | +with the package or get it working on another OS, I'd like to hear |
| 243 | +about it. E-mail me at: dbg@be.com |
| 244 | + |