Tree-based Convolutional Neural Network with Attention Unit

A modified version of the Tree-based Convolutional Neural Network (TBCNN) that uses a global attention vector in the aggregation layer instead of dynamic pooling.
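To make the idea of attention-based aggregation concrete, here is a minimal pure-Python sketch. It is not the code from network.py: the function names, the toy vectors, and the use of element-wise max pooling as a simplified stand-in for dynamic pooling are all assumptions made for illustration. Each node vector is scored against a single global attention vector, the scores go through a softmax, and the output is the weighted sum of the node vectors.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_aggregate(node_vectors, attention_vector):
    """Aggregate node vectors into one vector using a global attention vector.

    Hypothetical helper: scores each node by a dot product with the
    attention vector, then returns the softmax-weighted sum of the nodes.
    """
    scores = [sum(x * a for x, a in zip(vec, attention_vector))
              for vec in node_vectors]
    weights = softmax(scores)
    dim = len(attention_vector)
    pooled = [sum(w * vec[i] for w, vec in zip(weights, node_vectors))
              for i in range(dim)]
    return pooled, weights

def max_pool(node_vectors):
    """Element-wise max over nodes (assumed stand-in for dynamic pooling)."""
    return [max(column) for column in zip(*node_vectors)]

# Toy example: three nodes with two features each (made-up values).
nodes = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attn = [0.5, -0.5]
pooled, weights = attention_aggregate(nodes, attn)
print("attention weights:", weights)
print("attention output: ", pooled)
print("max pooling:      ", max_pool(nodes))
```

Unlike max pooling, the attention weights give one score per node, which is what makes it possible to report how much each AST node contributed to the prediction.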

How to run

First, run:

bash script.sh

to extract the pretrained embedding and download the Docker image of the f-ast tool.

For general training and testing:

python3 main.py --model_path "model/github_50_cpp_new" --n_classes 50 --training

For single file testing:

python3 live_test.py --model_path "model/github_50_cpp_new" --n_classes 50 --test_file "live_test/github_cpp/26/10.cpp"

There is a pretrained model in "model/github_50_cpp_new" for the GitHub C++ dataset. The "live_test" directory contains the raw source files for testing. Running the above command generates 9 files in the same directory as the test file. In this case, they are:

10.pb: protobuf format of the AST
10.pkl: pickle format of the protobuf
10_ast.txt: string format of the AST; the actual AST can be viewed in this file
10_raw_attention_without_node_type.csv: a CSV file containing the attention score of each node from the softmax layer (node_id, score)
10_raw_attention_with_node_type.csv: a CSV file containing the attention score of each node from the softmax layer, with the node type (node_id, node_type, score)
10_scaled_attention_without_node_type.csv: a CSV file containing the attention scores after Min-Max scaling to the range 0-1 (node_id, score)
10_scaled_attention_with_node_type.csv: a CSV file containing the attention scores after Min-Max scaling to the range 0-1, with the node type (node_id, node_type, score)
10_normal.html: a visual representation of the important nodes, based directly on the raw attention scores
10_accumulation.html: a visual representation of the important nodes, obtained by spreading the score of each parent node to its children and visualizing the accumulated scores
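The relationship between the raw, scaled, and accumulated scores can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the repository's implementation: it assumes the CSV files have no header row, the node type names in the sample are invented, and the accumulation rule (a node's accumulated score is its own score plus its parent's accumulated score) is only one plausible reading of "spreading the score of a parent node to its children".

```python
import csv
import io

def read_attention(csv_text):
    """Parse rows of (node_id, node_type, score); assumes no header row."""
    rows = []
    for node_id, node_type, score in csv.reader(io.StringIO(csv_text)):
        rows.append((node_id, node_type, float(score)))
    return rows

def min_max_scale(scores):
    """Rescale scores linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # All scores are equal, so there is no range to spread over.
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]

def accumulate(scores, parent_of):
    """Assumed rule: accumulated score = own score + parent's accumulated score."""
    acc = {}

    def visit(node):
        if node not in acc:
            parent = parent_of.get(node)
            inherited = visit(parent) if parent is not None else 0.0
            acc[node] = scores[node] + inherited
        return acc[node]

    for node in scores:
        visit(node)
    return acc

# Made-up sample in the (node_id, node_type, score) layout.
sample = "0,unit,0.25\n1,function,0.75\n2,name,0.5\n"
rows = read_attention(sample)
raw = [score for _, _, score in rows]
scaled = min_max_scale(raw)

# Hypothetical tree: node 0 is the root, 1 is its child, 2 is a child of 1.
score_by_id = {node_id: score for node_id, _, score in rows}
accumulated = accumulate(score_by_id, {"1": "0", "2": "1"})
print(scaled)
print(accumulated)
```

To try this on real output, the sample string could be replaced with the contents of a file such as 10_raw_attention_with_node_type.csv, provided its layout matches the assumptions above.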

Note

The first time you run any of the above scripts, it may take a while because the code needs to load the data and build the cache. Subsequent runs are much faster because the data is loaded from the cache.
Node types are listed in the file src_node_map.tsv.
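The first-run slowdown described in this note follows a common load-or-build caching pattern. The sketch below is a generic version of that pattern using pickle. The function name, the cache file name, and the placeholder data are hypothetical, and the repository's own caching code may differ.

```python
import os
import pickle
import tempfile

def load_with_cache(cache_path, build_fn):
    """Return cached data if the cache file exists; otherwise build and save it."""
    if os.path.exists(cache_path):
        with open(cache_path, "rb") as f:
            return pickle.load(f)
    data = build_fn()
    with open(cache_path, "wb") as f:
        pickle.dump(data, f)
    return data

calls = []

def slow_build():
    """Stand-in for the expensive data loading step (placeholder data)."""
    calls.append(1)
    return {"trees": [1, 2, 3]}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data_cache.pkl")
    first = load_with_cache(path, slow_build)   # builds and writes the cache
    second = load_with_cache(path, slow_build)  # reads the cache, no rebuild

print("build calls:", len(calls))
```

Deleting the cache file forces a rebuild on the next run, which is worth remembering with this pattern whenever the underlying data changes.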

