---
layout: post
title: "Bring TensorBoard to MXNet"
date: 2017-01-07 00:00:00 -0800
author: Zihao Zheng
---

With proper visualization, we can gain a better understanding of how Deep Learning works:

* Monitoring training/testing metrics through the learning curve, you know how well your model is learning from the data.
* Visualizing the distribution of a layer's gradients with histograms, you know whether your network is alive or suffering from vanishing gradients.
* Interpreting the embedding features from a layer, e.g. using t-SNE for high-dimensional data visualization, you gain intuition about its representational power.

There are many more techniques for visualizing neural networks than the ones above, which is why we want to build a handy tool for our MXNet users.

Thanks to the community, we already have [TensorBoard](https://www.tensorflow.org/versions/master/how_tos/graph_viz/index.html), which is easy to use and covers most daily use cases. However, TensorBoard is built together with TensorFlow, so we have to come up with a way to make a stand-alone version for general visualization purposes.

## Before we start

It's my first time getting involved in an open-source project like MXNet, which already has some visualization solutions. I found several similar issues requesting a TensorBoard-like tool: some people wanted to build the tool from scratch, while [@piiswrong](https://github.com/piiswrong) asked whether it was possible to strip TensorBoard out of TensorFlow. I preferred the latter, so I created an issue, [dmlc/mxnet#4003](https://github.com/dmlc/mxnet/issues/4003), for discussion and proposed my solution and a roadmap in that direction.

## The Logging Part

Technically, TensorBoard contains two parts: logging and rendering. TensorBoard supports these types of data:

* Scalar.
* Image.
* Audio.
* Histogram.
* Graph. The TensorFlow computational graph.
* Embedding.

### Get summary without running TensorFlow

In TensorFlow, a `summary` object can be generated by running a `session` or by running an operation. Here's an example from the [TensorBoard documentation](https://www.tensorflow.org/how_tos/summaries_and_tensorboard/):

```python
with tf.name_scope('cross_entropy'):
  # The raw formulation of cross-entropy,
  #
  # tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(tf.softmax(y)),
  #                               reduction_indices=[1]))
  #
  # can be numerically unstable.
  #
  # So here we use tf.nn.softmax_cross_entropy_with_logits on the
  # raw outputs of the nn_layer above, and then average across
  # the batch.
  diff = tf.nn.softmax_cross_entropy_with_logits(y, y_)
  with tf.name_scope('total'):
    cross_entropy = tf.reduce_mean(diff)
  tf.summary.scalar('cross_entropy', cross_entropy)
```

Luckily, those summaries are [Protocol Buffers](https://developers.google.com/protocol-buffers/), which makes things easy: the logging logic can be rewritten in pure Python. Thanks to the community, I found that [@mufeili](https://github.com/mufeili) had been working on this feature, and TensorBoard provides a relatively clean API for this purpose, which allows us to generate summaries without running a TensorFlow operation.

The logging part code is placed in [tensorboard/python](https://github.com/dmlc/tensorboard/tree/master/python).
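To make the "summaries are just Protocol Buffers" point concrete, here is a minimal, dependency-free sketch that hand-encodes a scalar `Summary` message on the protobuf wire format. The field numbers come from TensorFlow's `summary.proto` (`Summary.value` = 1, `Value.tag` = 1, `Value.simple_value` = 2); in practice you would use the generated protobuf classes rather than encoding by hand:

```python
import struct

def encode_varint(n):
    # Protobuf base-128 varint encoding for non-negative integers.
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)

def scalar_summary(tag, value):
    # Summary.Value: field 1 = tag (string), field 2 = simple_value (float).
    tag_bytes = tag.encode('utf-8')
    inner = (b'\x0a' + encode_varint(len(tag_bytes)) + tag_bytes +
             b'\x15' + struct.pack('<f', value))
    # Summary: field 1 = repeated Value (length-delimited message).
    return b'\x0a' + encode_varint(len(inner)) + inner

serialized = scalar_summary('cross_entropy', 0.25)
```

Parsing these bytes with the generated `Summary` class should yield `value { tag: "cross_entropy" simple_value: 0.25 }`, with no TensorFlow session involved.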
### Logging events in pure Python

TensorBoard reads `Event` files, which contain the serialized `summary` information:

```bash
$ tensorboard --logdir=path-to-event-files
```

See [tensorboard/record_writer.py](https://github.com/dmlc/tensorboard/blob/master/python/tensorboard/record_writer.py) for details.
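For reference, each record in an `Event` file is framed as a length, a masked CRC of the length, the payload, and a masked CRC of the payload. Below is a minimal sketch of that framing; note it substitutes plain `zlib.crc32` for the CRC32C (Castagnoli) checksum that the real `record_writer.py` uses, purely to stay dependency-free:

```python
import io
import struct
import zlib

def masked_crc(data):
    # NOTE: the real writer uses CRC32C; zlib.crc32 is a stand-in here.
    crc = zlib.crc32(data) & 0xFFFFFFFF
    # Rotate right by 15 bits and add the TFRecord mask constant.
    return (((crc >> 15) | (crc << 17)) + 0xA282EAD8) & 0xFFFFFFFF

def write_record(stream, data):
    # Framing: length (8 bytes, little-endian) + masked CRC of length
    # (4 bytes) + payload + masked CRC of payload (4 bytes).
    header = struct.pack('<Q', len(data))
    stream.write(header)
    stream.write(struct.pack('<I', masked_crc(header)))
    stream.write(data)
    stream.write(struct.pack('<I', masked_crc(data)))

buf = io.BytesIO()
write_record(buf, b'serialized-event-proto')
```

Each serialized `Event` protobuf is appended as one such record, which is exactly the file layout `tensorboard --logdir` scans for.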
## The Rendering Part

The next step is to build TensorBoard's rendering part. The goal is an easy-to-maintain solution, so we didn't try to strip out the relevant code and rebuild it from scratch. Instead, we pull the TensorFlow codebase and use Bazel to build TensorBoard:

```bash
$ git clone https://github.com/tensorflow/tensorflow
$ cd tensorflow
$ ./configure
$ bazel build tensorflow/tensorboard:tensorboard
```

All dependencies can then be found in `bazel-bin/tensorflow/tensorboard`: a Python binary `tensorboard` that launches the app, and its dependencies in `tensorboard.runfiles`.

Here's the file structure before we build TensorBoard (and move files):

```
├── LICENSE
├── Makefile
├── README.md
├── installer.sh
├── python
│   ├── README.md
│   ├── setup.py
│   └── tensorboard
│       ├── __init__.py
│       ├── crc32c.py
│       ├── event_file_writer.py
│       ├── record_writer.py
│       ├── src
│       │   └── __init__.py
│       ├── summary.py
│       └── writer.py
├── tensorboard
│   └── src
│       ├── event.proto
│       ├── resource_handle.proto
│       ├── summary.proto
│       ├── tensor.proto
│       ├── tensor_shape.proto
│       └── types.proto
└── tools
    └── pip_package
        ├── MANIFEST.in
        ├── README
        └── build_pip_package.sh
```

Then we get `tensorboard` and `tensorboard.runfiles/`:

```
├── LICENSE
├── Makefile
├── README.md
├── installer.sh
├── python
│   ├── MANIFEST.in
│   ├── README
│   ├── README.md
│   ├── setup.py
│   └── tensorboard
│       ├── __init__.py
│       ├── crc32c.py
│       ├── event_file_writer.py
│       ├── record_writer.py
│       ├── src
│       ├── summary.py
│       ├── tensorboard           <--- python binary
│       ├── tensorboard.runfiles  <--- directory/dependencies
│       └── writer.py
├── tensorboard
│   └── src
│       ├── event.proto
│       ├── resource_handle.proto
│       ├── summary.proto
│       ├── tensor.proto
│       ├── tensor_shape.proto
│       └── types.proto
└── tools
    └── pip_package
        ├── MANIFEST.in
        ├── README
        └── build_pip_package.sh
```

Most importantly, the `tensorboard` binary is directly usable, as it searches for and imports the relevant packages:

```python
# Find the runfiles tree
def FindModuleSpace():
  # Follow symlinks, looking for my module space
  stub_filename = os.path.abspath(sys.argv[0])
  while True:
    # Found it?
    module_space = stub_filename + '.runfiles'
    if os.path.isdir(module_space):
      break

    runfiles_pattern = "(.*\.runfiles)/.*"
    if IsWindows():
      runfiles_pattern = "(.*\.runfiles)\\.*"
    matchobj = re.match(runfiles_pattern, os.path.abspath(sys.argv[0]))
    if matchobj:
      module_space = matchobj.group(1)
      break

    raise AssertionError('Cannot find .runfiles directory for %s' %
                         sys.argv[0])
  return module_space
```

However, this package search logic can break when we install from a Python wheel; we discuss the solution below.

## Build a pip package for both logging and rendering

Now we've finished all the preparations for building a Python wheel. We have a binary file and want to launch TensorBoard like this:

```bash
$ tensorboard --logdir=path-to-event-file/
```

It could be done by assigning a binary script entry in `setup.py`; see [tensorboard/setup.py](https://github.com/dmlc/tensorboard/blob/master/python/setup.py) for more. The basic idea of the setup function, in this scenario, is to pack all necessary dependencies and set up the binary file correctly.
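As a rough sketch only (the name, version, and paths below are illustrative, not the actual dmlc/tensorboard `setup.py`), a binary script entry can be declared with setuptools' `scripts` argument:

```python
from setuptools import setup, find_packages

setup(
    name='tensorboard',
    version='0.1.0',  # illustrative version
    packages=find_packages(),
    # Ship the pre-built binary's runfiles alongside the package.
    include_package_data=True,
    # Copy the `tensorboard` launcher onto the user's PATH at install time.
    scripts=['tensorboard/tensorboard'],
)
```

With this, `pip install` places the launcher next to the environment's other executables, which is exactly why the `.runfiles` lookup needs patching, as explained next.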

Oops, back to the `FindModuleSpace` issue mentioned above. The `tensorboard` binary is copied to, for example, `/Users/zihao.zzh/anaconda2/envs/tensorboard/bin/tensorboard`, but when launched, it searches for `.runfiles` in that same directory. So we made [tensorboard/tensorboard-binary.patch](https://github.com/dmlc/tensorboard/blob/master/tensorboard-binary.patch) and apply it during the build process.

We provide an `installer.sh` to automate this process for you; check it out here: [tensorboard/installer.sh](https://github.com/dmlc/tensorboard/blob/master/installer.sh)

## How to use?

We created a tutorial on understanding the vanishing gradient problem through visualization; see `docs/tutorial/`.

## Future works

* Add this component as a submodule in MXNet.
* Support `image`, `video`, `embedding` and even `graph` for MXNet.
* Make package installation easier by providing a pre-built wheel, so users can install this package in one line.

Feel free to join us and contribute!

## About the author

[Zihao Zheng](https://github.com/zihaolucky) is an algorithm engineer at AI Lab, Alibaba Group. Before joining the industry, he studied Mathematics at South China Normal University and learned Machine Learning and Computer Science through MOOCs.

Many thanks to the community, and to the TensorBoard authors and contributors!
