---
layout: post
title: "Bring TensorBoard to MXNet"
date: 2017-01-07 00:00:00 -0800
author: Zihao Zheng
---

With proper visualization, we can better understand the mechanisms of deep learning:

* Monitoring training/testing metrics through learning curves tells you how well your model is learning from the data.
* Visualizing the distribution of a layer's gradients with histograms tells you whether your network is alive or dying (e.g. vanishing gradients).
* Interpreting the embedding features from a layer, e.g. with t-SNE for high-dimensional data visualization, gives you intuition about its representational power.

There are many more techniques for visualizing neural networks than the above, which is why we want to build a handy tool for our MXNet users.

Thanks to the community, we already have [TensorBoard](https://www.tensorflow.org/versions/master/how_tos/graph_viz/index.html), which is easy to use and covers most daily use cases. However, TensorBoard is built together with TensorFlow, so we had to come up with a way to make a stand-alone version for general visualization purposes.

## Before we start
This is my first time getting involved in an open-source project like MXNet, which already had some visualization solutions. I found several similar issues requesting a TensorBoard-like tool: some people wanted to build the tool from scratch, while [@piiswrong](https://github.com/piiswrong) asked whether it was possible to strip TensorBoard out of TensorFlow. I preferred the latter approach, so I created [dmlc/mxnet#4003](https://github.com/dmlc/mxnet/issues/4003) for discussion and proposed my solution and a roadmap in that direction.

## The Logging Part
Technically, TensorBoard contains two parts: logging and rendering. It supports these types of data:

* Scalar.
* Image.
* Video.
* Histogram.
* Graph. The TensorFlow computational graph.
* Embedding.

### Get summary without running TensorFlow

In TensorFlow, a `summary` object can be generated by running a `session` or an operation. Here's an example from the [TensorBoard documentation](https://www.tensorflow.org/how_tos/summaries_and_tensorboard/):

```python
with tf.name_scope('cross_entropy'):
  # The raw formulation of cross-entropy,
  #
  # tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(tf.softmax(y)),
  #                               reduction_indices=[1]))
  #
  # can be numerically unstable.
  #
  # So here we use tf.nn.softmax_cross_entropy_with_logits on the
  # raw outputs of the nn_layer above, and then average across
  # the batch.
  diff = tf.nn.softmax_cross_entropy_with_logits(y, y_)
  with tf.name_scope('total'):
    cross_entropy = tf.reduce_mean(diff)
tf.summary.scalar('cross_entropy', cross_entropy)
```

Luckily, those summaries are [Protocol Buffers](https://developers.google.com/protocol-buffers/), which makes things easy: the logging logic can be rewritten in pure Python. Thanks to the community, I found that [@mufeili](https://github.com/mufeili) had been working on this feature, and TensorBoard provides a relatively clean API for this purpose, which allows us to generate summaries without running any TensorFlow operation.
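To make the "pure Python" point concrete, here is a hand-rolled sketch of encoding a scalar summary on the protobuf wire format. The field numbers (`value = 1` in `Summary`, `tag = 1` and `simple_value = 2` in `Summary.Value`) are my reading of `summary.proto`; the real logging code uses generated protobuf bindings rather than encoding bytes by hand.

```python
import struct

def encode_scalar_summary(tag, value):
    """Hand-encode a minimal Summary{value{tag, simple_value}} protobuf.

    Illustrative sketch only: assumes field numbers tag=1 (string) and
    simple_value=2 (float) in Summary.Value, value=1 (message) in
    Summary, and that all lengths fit in a single-byte varint (< 128).
    """
    tag_bytes = tag.encode('utf-8')
    # Summary.Value: field 1 (tag, length-delimited) + field 2 (simple_value, fixed32)
    value_msg = (bytes([0x0A, len(tag_bytes)]) + tag_bytes
                 + b'\x15' + struct.pack('<f', value))
    # Summary: field 1 (value, length-delimited submessage)
    return bytes([0x0A, len(value_msg)]) + value_msg

payload = encode_scalar_summary('cross_entropy', 0.5)
```

The point is simply that nothing here needs a TensorFlow session: a summary is just bytes that any Python process can produce.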

The logging code is placed in [tensorboard/python](https://github.com/dmlc/tensorboard/tree/master/python).

### Logging events in pure Python

TensorBoard reads `Event` files, which contain the serialized `summary` data:

```bash
$ tensorboard --logdir=path-to-event-files
```

See [tensorboard/record_writer.py](https://github.com/dmlc/tensorboard/blob/master/python/tensorboard/record_writer.py) for details.
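The event file uses TensorFlow's TFRecord framing: each record is a length header plus a payload, each protected by a masked CRC32C checksum. Here is a minimal pure-Python sketch of that framing; the masking constant `0xa282ead8` and the layout follow TensorFlow's record format, and the repository's `crc32c.py` plays the role of the checksum function.

```python
import struct

def crc32c(data):
    # Bitwise CRC32C (Castagnoli polynomial, reflected form 0x82F63B78),
    # the checksum TensorFlow's record format uses.
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF

def masked_crc32c(data):
    # TensorFlow masks the CRC by rotating it and adding a constant.
    crc = crc32c(data)
    return (((crc >> 15) | (crc << 17)) + 0xa282ead8) & 0xFFFFFFFF

def make_record(data):
    # TFRecord framing: uint64 length, masked CRC of the length,
    # the payload, masked CRC of the payload (all little-endian).
    header = struct.pack('<Q', len(data))
    return (header
            + struct.pack('<I', masked_crc32c(header))
            + data
            + struct.pack('<I', masked_crc32c(data)))
```

Appending `make_record(serialized_event)` for each event is essentially what the event file writer does, so TensorBoard can consume files we produce without TensorFlow ever being involved.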

## The Rendering Part
The next step is to build TensorBoard's rendering part. The goal is an easy-to-maintain solution, so we didn't try to strip out the relevant code and build it from scratch. Instead, we pull the TensorFlow codebase and use Bazel to build TensorBoard:

```bash
$ git clone https://github.com/tensorflow/tensorflow
$ cd tensorflow
$ ./configure
$ bazel build tensorflow/tensorboard:tensorboard
```

All the outputs can then be found in `bazel-bin/tensorflow/tensorboard`: a Python binary `tensorboard` to launch the app, and its dependencies in `tensorboard.runfiles`.

Here's the file structure before we build TensorBoard (and move files):

```
├── LICENSE
├── Makefile
├── README.md
├── installer.sh
├── python
│   ├── README.md
│   ├── setup.py
│   └── tensorboard
│       ├── __init__.py
│       ├── crc32c.py
│       ├── event_file_writer.py
│       ├── record_writer.py
│       ├── src
│       │   └── __init__.py
│       ├── summary.py
│       └── writer.py
├── tensorboard
│   └── src
│       ├── event.proto
│       ├── resource_handle.proto
│       ├── summary.proto
│       ├── tensor.proto
│       ├── tensor_shape.proto
│       └── types.proto
└── tools
    └── pip_package
        ├── MANIFEST.in
        ├── README
        └── build_pip_package.sh
```

Then we get `tensorboard` and `tensorboard.runfiles/`:

```
├── LICENSE
├── Makefile
├── README.md
├── installer.sh
├── python
│   ├── MANIFEST.in
│   ├── README
│   ├── README.md
│   ├── setup.py
│   └── tensorboard
│       ├── __init__.py
│       ├── crc32c.py
│       ├── event_file_writer.py
│       ├── record_writer.py
│       ├── src
│       ├── summary.py
│       ├── tensorboard           <--- Python binary
│       ├── tensorboard.runfiles  <--- dependencies directory
│       └── writer.py
├── tensorboard
│   └── src
│       ├── event.proto
│       ├── resource_handle.proto
│       ├── summary.proto
│       ├── tensor.proto
│       ├── tensor_shape.proto
│       └── types.proto
└── tools
    └── pip_package
        ├── MANIFEST.in
        ├── README
        └── build_pip_package.sh
```

Most importantly, the `tensorboard` binary is directly usable, as it searches for and imports the relevant packages:

```python
# Find the runfiles tree
def FindModuleSpace():
  # Follow symlinks, looking for my module space
  stub_filename = os.path.abspath(sys.argv[0])
  while True:
    # Found it?
    module_space = stub_filename + '.runfiles'
    if os.path.isdir(module_space):
      break

    runfiles_pattern = "(.*\.runfiles)/.*"
    if IsWindows():
      runfiles_pattern = "(.*\.runfiles)\\.*"
    matchobj = re.match(runfiles_pattern, os.path.abspath(sys.argv[0]))
    if matchobj:
      module_space = matchobj.group(1)
      break

    raise AssertionError('Cannot find .runfiles directory for %s' %
                         sys.argv[0])
  return module_space
```

However, this package search logic can cause problems if we install from a Python wheel. We will discuss the solution later.

## Build pip package for both logging and rendering
Now we've finished all the preparations for building a Python wheel. We have a binary file and want to launch TensorBoard like this:

```bash
$ tensorboard --logdir=path-to-event-file/
```

This can be done by declaring a script entry in `setup.py`; see [tensorboard/setup.py](https://github.com/dmlc/tensorboard/blob/master/python/setup.py) for more. The basic idea of the setup function, in this scenario, is to pack all necessary dependencies and set up the binary file correctly.
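For illustration, the relevant `setup()` arguments might look roughly like this. The names and paths here are illustrative, not copied from the project; see the linked `setup.py` for the real values.

```python
# Illustrative sketch of the setup() arguments; names and paths are
# hypothetical, see the project's setup.py for the actual values.
setup_args = dict(
    name='tensorboard',
    packages=['tensorboard'],
    # ship the bazel-built runfiles tree alongside the package
    package_data={'tensorboard': ['tensorboard.runfiles/*']},
    # install the binary stub onto the user's PATH
    scripts=['tensorboard/tensorboard'],
)
```

`scripts` is what makes the bare `tensorboard` command available after `pip install`, while `package_data` carries the bazel build outputs into the wheel.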

Oops, back to the `FindModuleSpace` issue mentioned above. The `tensorboard` binary gets copied to, for example, `/Users/zihao.zzh/anaconda2/envs/tensorboard/bin/tensorboard`. But when you launch TensorBoard, it searches for `.runfiles` in that same directory, so we made a [tensorboard/tensorboard-binary.patch](https://github.com/dmlc/tensorboard/blob/master/tensorboard-binary.patch) and apply it in the build process.

We provide an `installer.sh` to automate this process for you; check it out here: [tensorboard/installer.sh](https://github.com/dmlc/tensorboard/blob/master/installer.sh)

## How to use?

We created a tutorial on understanding the vanishing gradient problem through visualization; see `docs/tutorial/`.

## Future work
* Add this component as a submodule in MXNet.
* Support `image`, `video`, `embedding`, and even `graph` for MXNet.
* Make installation easier by providing a pre-built wheel, so users can install this package in one line.

Feel free to join us and contribute!

## About the author
[Zihao Zheng](https://github.com/zihaolucky) is an algorithm engineer at AI Lab, Alibaba Group. Before joining the industry, he studied mathematics at South China Normal University and learned machine learning and computer science from MOOCs.

Many thanks to the community, and to the TensorBoard authors and contributors!