K Nearest Neighbours

For a detailed qualitative and quantitative analysis, please see the report.

Dependencies

Since this is a Go program, the dependencies can be downloaded by running:

go mod tidy

To set up protobuf for Go, install the Go plugins for the protocol compiler as described in the docs. You can then build from the protobuf files by running:

bash proto.sh

You can create a data set to test on by running:

python create_data.py

You may alter the number of data points, the number of floating digits, etc. in this Python script.

Running

Single Command

A sample launch script is provided in launch.sh. Simply run (assuming you have gnome-terminal):

bash launch.sh

Multiple Commands

Launch each server by running:

go run server/main.go "active_servers.txt"

Each server writes its port number to this file, so new servers can be introduced dynamically for scalability.

To create random data to test on, run:

python create_data.py

Alternatively, store your own data in data.txt.

To partition the data in data.txt and send it to the servers in active_servers.txt, run:

go run send-data/main.go
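As a rough illustration of the partitioning step, the sketch below splits a list of data points across a given number of servers in round-robin order. The README does not say how send-data actually divides the data, so the round-robin strategy, the function name `partition`, and the placeholder point strings are assumptions.

```go
package main

import "fmt"

// partition distributes points over n shards in round-robin order, so shard
// sizes differ by at most one. It returns nil when n is not positive.
// Round-robin is an assumed strategy, not necessarily what send-data does.
func partition(points []string, n int) [][]string {
	if n <= 0 {
		return nil
	}
	shards := make([][]string, n)
	for i, p := range points {
		shards[i%n] = append(shards[i%n], p)
	}
	return shards
}

func main() {
	// Placeholder points; the real format of data.txt may differ.
	points := []string{"1.0 2.0", "3.5 4.5", "0.2 0.1", "9.9 8.8", "5.0 5.0"}
	for i, shard := range partition(points, 2) {
		fmt.Printf("server %d gets %d points: %q\n", i, len(shard), shard)
	}
}
```

In this sketch the number of shards would be the number of ports listed in active_servers.txt, and each shard would then be sent to the corresponding server.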

Finally, launch the client, which prompts for the data point and the number of its nearest neighbours to find:

go run client/main.go --port_file=active_servers.txt

The output is printed to the terminal and written to the file nn_<num-nearest>_<data-point>.txt. Note that the client need not be aware of the number of data points on each server. Each line of the output has the form <nearest-neighbour> -> <distance-from-datapoint>, and the final line reports the time taken.
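One way a client could combine answers without knowing how much data each server holds is to ask every server for its own nearest candidates and keep the overall closest ones. The sketch below shows that merge step and prints lines in the output form described above. The `Neighbour` type, the `mergeNearest` function, and the merge strategy itself are assumptions for illustration, not the repository's client code.

```go
package main

import (
	"fmt"
	"sort"
)

// Neighbour pairs a data point (kept as text here) with its distance
// from the query point. This type is hypothetical.
type Neighbour struct {
	Point    string
	Distance float64
}

// mergeNearest flattens the per-server candidate lists, sorts them by
// ascending distance, and returns at most k of them.
func mergeNearest(perServer [][]Neighbour, k int) []Neighbour {
	var all []Neighbour
	for _, candidates := range perServer {
		all = append(all, candidates...)
	}
	sort.SliceStable(all, func(i, j int) bool {
		return all[i].Distance < all[j].Distance
	})
	if k < 0 {
		k = 0
	}
	if len(all) > k {
		all = all[:k]
	}
	return all
}

func main() {
	// Made-up candidate lists standing in for two servers' replies.
	results := [][]Neighbour{
		{{"(1, 1)", 1.4}, {"(4, 4)", 5.7}},
		{{"(0, 2)", 2.0}, {"(3, 0)", 3.0}},
	}
	for _, n := range mergeNearest(results, 3) {
		fmt.Printf("%s -> %g\n", n.Point, n.Distance)
	}
}
```

If each server returns its k closest points, the true k nearest neighbours overall are guaranteed to be among the merged candidates, which is why the client would not need to know the size of any server's partition.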