For a detailed qualitative and quantitative analysis, please see the report.
Since this is a Go program, the dependencies can be downloaded simply by running:
go mod tidy
To set up protobuf for Go, install the Go plugins for the protocol compiler as described in the docs. You can then build from the protobuf files by running:
bash proto.sh
You can create a data set to test on by running:
python create_data.py
You can change the number of data points, the number of floating-point digits, etc. in this Python script.
A sample launch script is provided in launch.sh. Simply run (assuming you have gnome-terminal):
bash launch.sh
Launch each server by running:
go run server/main.go "active_servers.txt"
The servers write their port numbers to this file, which allows servers to be introduced dynamically for scalability.
To create random data to test on, run:
python create_data.py
Otherwise, store your own data in data.txt.
To partition the data in data.txt and send it to the servers in active_servers.txt, run:
go run send-data/main.go
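The partitioning step can be pictured with the sketch below, which splits the data points into near-equal contiguous chunks, one per active server. This is only an assumed scheme: the function name partition, the use of one float per data point, and the even split are illustrative choices, not necessarily what send-data/main.go does.

```go
package main

import "fmt"

// partition splits points into n contiguous chunks whose sizes differ
// by at most one. Assumption: each data point is a single float64.
func partition(points []float64, n int) [][]float64 {
	if n <= 0 {
		return nil
	}
	chunks := make([][]float64, 0, n)
	base, extra := len(points)/n, len(points)%n
	start := 0
	for i := 0; i < n; i++ {
		size := base
		if i < extra {
			size++ // the first `extra` chunks absorb the remainder
		}
		chunks = append(chunks, points[start:start+size])
		start += size
	}
	return chunks
}

func main() {
	data := []float64{1, 2, 3, 4, 5, 6, 7}
	// With three servers, the chunks have sizes 3, 2 and 2.
	for i, chunk := range partition(data, 3) {
		fmt.Printf("server %d gets %v\n", i, chunk)
	}
}
```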
Finally, launch the client, which prompts for the data point and the number of its nearest neighbours to find:
go run client/main.go --port_file=active_servers.txt
The output is printed to the terminal and written to the file nn_<num-nearest>_<data-point>.txt. Note that the client need not be aware of the number of data points on each server.
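One way to see why the client does not need per-server data counts: if every server returns its own k nearest candidates, the client can simply merge them and keep the k closest overall. The sketch below assumes one-dimensional points and absolute difference as the distance; the type Neighbour and the functions localNearest and mergeNearest are hypothetical names for this example, not taken from the repository.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// Neighbour pairs a data point with its distance from the query.
type Neighbour struct {
	Point    float64
	Distance float64
}

// localNearest is what a single server might compute: its own k nearest
// points to the query (assuming 1-D data and absolute distance).
func localNearest(data []float64, query float64, k int) []Neighbour {
	out := make([]Neighbour, 0, len(data))
	for _, p := range data {
		out = append(out, Neighbour{Point: p, Distance: math.Abs(p - query)})
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Distance < out[j].Distance })
	if k < 0 {
		k = 0
	}
	if len(out) > k {
		out = out[:k]
	}
	return out
}

// mergeNearest combines the per-server candidates and keeps the k closest.
// It never needs to know how many points each server holds.
func mergeNearest(perServer [][]Neighbour, k int) []Neighbour {
	var all []Neighbour
	for _, res := range perServer {
		all = append(all, res...)
	}
	sort.Slice(all, func(i, j int) bool { return all[i].Distance < all[j].Distance })
	if k < 0 {
		k = 0
	}
	if len(all) > k {
		all = all[:k]
	}
	return all
}

func main() {
	servers := [][]float64{{1, 10, 20}, {4, 7}, {5}}
	query, k := 5.0, 3

	var results [][]Neighbour
	for _, data := range servers {
		results = append(results, localNearest(data, query, k))
	}
	for _, n := range mergeNearest(results, k) {
		fmt.Printf("%v -> %v\n", n.Point, n.Distance)
	}
}
```

The merged result is exact because any point among the global k nearest must also be among the k nearest on the server that holds it.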
Each line of the output is of the form <nearest-neighbour> -> <distance-from-datapoint>, followed by the time taken at the end.
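If you want to post-process the output file, a line in this form can be read back with a small parser like the sketch below. It assumes both fields are plain decimal numbers; the function name parseLine is hypothetical. Lines that do not match, such as the final time-taken line, yield an error and can be skipped.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseLine reads one "<nearest-neighbour> -> <distance>" line.
// Assumption: both fields are plain decimal numbers.
func parseLine(line string) (point, dist float64, err error) {
	parts := strings.SplitN(line, "->", 2)
	if len(parts) != 2 {
		return 0, 0, fmt.Errorf("not a neighbour line: %q", line)
	}
	point, err = strconv.ParseFloat(strings.TrimSpace(parts[0]), 64)
	if err != nil {
		return 0, 0, fmt.Errorf("bad neighbour value: %w", err)
	}
	dist, err = strconv.ParseFloat(strings.TrimSpace(parts[1]), 64)
	if err != nil {
		return 0, 0, fmt.Errorf("bad distance value: %w", err)
	}
	return point, dist, nil
}

func main() {
	point, dist, err := parseLine("4.25 -> 0.75")
	if err != nil {
		fmt.Println("skipping line:", err)
		return
	}
	fmt.Println("neighbour:", point, "distance:", dist)
}
```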