<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<html><head><title>Python: module dataset_fetcher</title>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head><body bgcolor="#f0f0f8">
<table width="100%" cellspacing=0 cellpadding=2 border=0 summary="heading">
<tr bgcolor="#7799ee">
<td valign=bottom> <br>
<font color="#ffffff" face="helvetica, arial"> <br><big><big><strong>dataset_fetcher</strong></big></big></font></td
><td align=right valign=bottom
><font color="#ffffff" face="helvetica, arial"><a href=".">index</a><br><a href="file:/home/ubuntu/Documents/studies/3_1/IR_CS_F469/assn2/IR2/src/dataset_fetcher.py">/home/ubuntu/Documents/studies/3_1/IR_CS_F469/assn2/IR2/src/dataset_fetcher.py</a></font></td></tr></table>
<p></p>
<p>
<table width="100%" cellspacing=0 cellpadding=2 border=0 summary="section">
<tr bgcolor="#aa55cc">
<td colspan=3 valign=bottom> <br>
<font color="#ffffff" face="helvetica, arial"><big><strong>Modules</strong></big></font></td></tr>
<tr><td bgcolor="#aa55cc"><tt> </tt></td><td> </td>
<td width="100%"><table width="100%" summary="list"><tr><td width="25%" valign=top><a href="numpy.html">numpy</a><br>
<a href="pickle.html">pickle</a><br>
</td><td width="25%" valign=top><a href="queue.html">queue</a><br>
<a href="scipy.sparse.html">scipy.sparse</a><br>
</td><td width="25%" valign=top><a href="sys.html">sys</a><br>
<a href="time.html">time</a><br>
</td><td width="25%" valign=top><a href="tweepy.html">tweepy</a><br>
</td></tr></table></td></tr></table><p>
<table width="100%" cellspacing=0 cellpadding=2 border=0 summary="section">
<tr bgcolor="#ee77aa">
<td colspan=3 valign=bottom> <br>
<font color="#ffffff" face="helvetica, arial"><big><strong>Classes</strong></big></font></td></tr>
<tr><td bgcolor="#ee77aa"><tt> </tt></td><td> </td>
<td width="100%"><dl>
<dt><font face="helvetica, arial"><a href="builtins.html#object">builtins.object</a>
</font></dt><dd>
<dl>
<dt><font face="helvetica, arial"><a href="dataset_fetcher.html#DatasetFetcher">DatasetFetcher</a>
</font></dt><dt><font face="helvetica, arial"><a href="dataset_fetcher.html#ListToMatrixConverter">ListToMatrixConverter</a>
</font></dt><dt><font face="helvetica, arial"><a href="dataset_fetcher.html#Logger">Logger</a>
</font></dt></dl>
</dd>
</dl>
<p>
<table width="100%" cellspacing=0 cellpadding=2 border=0 summary="section">
<tr bgcolor="#ffc8d8">
<td colspan=3 valign=bottom> <br>
<font color="#000000" face="helvetica, arial"><a name="DatasetFetcher">class <strong>DatasetFetcher</strong></a>(<a href="builtins.html#object">builtins.object</a>)</font></td></tr>
<tr bgcolor="#ffc8d8"><td rowspan=2><tt> </tt></td>
<td colspan=2><tt>An instance of <a href="#DatasetFetcher">DatasetFetcher</a> is used to obtain the dataset from<br>
the internet<br> </tt></td></tr>
<tr><td> </td>
<td width="100%">Methods defined here:<br>
<dl><dt><a name="DatasetFetcher-__init__"><strong>__init__</strong></a>(self, key, secret, logger)</dt><dd><tt>Initializes an instance of <a href="#DatasetFetcher">DatasetFetcher</a><br>
<br>
Args:<br>
key: key to be used for authentication<br>
secret: secret to be used for authentication<br>
logger: An instance of <a href="#Logger">Logger</a> to be used for logging purposes by public<br>
member functions</tt></dd></dl>
<dl><dt><a name="DatasetFetcher-get_dataset"><strong>get_dataset</strong></a>(self, seed_user, friends_limit, followers_limit, limit, live_save, users_path, adj_list_path)</dt><dd><tt>Obtain the dataset<br>
<br>
Args:<br>
seed_user: id, screen_name, or name of the user at which the BFS starts<br>
friends_limit: Maximum number of friends to consider for each user<br>
followers_limit: Maximum number of followers to consider for each user<br>
limit: Maximum number of users to find friends and followers of<br>
live_save: Whether to save computed data frequently<br>
users_path: Path to the file where the users info will be stored<br>
adj_list_path: Path to the file where the adjacency list will be stored</tt></dd></dl>
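<p>The breadth-first traversal that the arguments above describe could be sketched roughly as follows. This is only an illustration under stated assumptions, not the module's actual code: <tt>bfs_crawl</tt>, <tt>get_friends</tt> and <tt>get_followers</tt> are hypothetical names standing in for the real API calls, and the edge direction (an entry means "follows") is assumed.</p>
<pre>
```python
import queue


def bfs_crawl(seed_user, get_friends, get_followers,
              friends_limit, followers_limit, limit):
    """Breadth-first crawl returning {user: [users that user follows]}."""
    adj_list = {}
    seen = {seed_user}
    pending = queue.Queue()
    pending.put(seed_user)
    expanded = 0  # number of users whose neighbours have been fetched
    while not pending.empty() and expanded != limit:
        user = pending.get()
        # Hypothetical callables stand in for the real network requests.
        friends = list(get_friends(user))[:friends_limit]
        followers = list(get_followers(user))[:followers_limit]
        # Assumed edge direction: an entry u: [v, ...] means "u follows v".
        edges = [(user, f) for f in friends] + [(f, user) for f in followers]
        for src, dst in edges:
            targets = adj_list.setdefault(src, [])
            if dst not in targets:
                targets.append(dst)
        # Queue every newly discovered user for a later expansion.
        for other in friends + followers:
            if other not in seen:
                seen.add(other)
                pending.put(other)
        expanded += 1
    return adj_list
```
</pre>
<p>In this sketch <tt>limit</tt> caps how many users are expanded, while the two per-user limits cap how many neighbours are kept for each of them.</p>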
<dl><dt><a name="DatasetFetcher-save_dataset"><strong>save_dataset</strong></a>(self, users_path, adj_list_path)</dt><dd><tt>Save the dataset obtained by get_dataset<br>
<br>
Args:<br>
users_path: Path to the file where users info will be stored<br>
adj_list_path: Path to the file where the adjacency list will be stored</tt></dd></dl>
<hr>
Data descriptors defined here:<br>
<dl><dt><strong>__dict__</strong></dt>
<dd><tt>dictionary for instance variables (if defined)</tt></dd>
</dl>
<dl><dt><strong>__weakref__</strong></dt>
<dd><tt>list of weak references to the object (if defined)</tt></dd>
</dl>
</td></tr></table> <p>
<table width="100%" cellspacing=0 cellpadding=2 border=0 summary="section">
<tr bgcolor="#ffc8d8">
<td colspan=3 valign=bottom> <br>
<font color="#000000" face="helvetica, arial"><a name="ListToMatrixConverter">class <strong>ListToMatrixConverter</strong></a>(<a href="builtins.html#object">builtins.object</a>)</font></td></tr>
<tr bgcolor="#ffc8d8"><td rowspan=2><tt> </tt></td>
<td colspan=2><tt>An instance of <a href="#ListToMatrixConverter">ListToMatrixConverter</a> is used to convert the data obtained<br>
by the dataset fetcher from adjacency list form to a matrix form (and an<br>
index-to-userid map)<br> </tt></td></tr>
<tr><td> </td>
<td width="100%">Methods defined here:<br>
<dl><dt><a name="ListToMatrixConverter-__init__"><strong>__init__</strong></a>(self, adj_list_path)</dt><dd><tt>Initializes an instance of <a href="#ListToMatrixConverter">ListToMatrixConverter</a><br>
<br>
Args:<br>
adj_list_path: Path to the file where the adjacency list is stored</tt></dd></dl>
<dl><dt><a name="ListToMatrixConverter-convert"><strong>convert</strong></a>(self)</dt><dd><tt>Use the adjacency list to create the link matrix and a dictionary that<br>
maps the index in the link matrix to a user id</tt></dd></dl>
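<p>As a rough illustration of the conversion described above, the sketch below builds a dense link matrix and an index-to-user-id dictionary from an in-memory adjacency list. The function name <tt>adj_list_to_matrix</tt>, the dictionary input format, and the convention that row <tt>i</tt>, column <tt>j</tt> is 1 when user <tt>i</tt> links to user <tt>j</tt> are assumptions made for the example; the module itself reads the adjacency list from <tt>adj_list_path</tt>.</p>
<pre>
```python
import numpy as np


def adj_list_to_matrix(adj_list):
    """Return (link_matrix, index_to_user) for a {user: [linked users]} dict."""
    # Collect every user id that appears as a source or as a target.
    users = sorted(set(adj_list) | {v for vs in adj_list.values() for v in vs})
    index_to_user = dict(enumerate(users))
    user_to_index = {user: index for index, user in index_to_user.items()}
    matrix = np.zeros((len(users), len(users)), dtype=np.int8)
    for src, targets in adj_list.items():
        for dst in targets:
            # Assumed convention: matrix[i, j] == 1 means user i links to user j.
            matrix[user_to_index[src], user_to_index[dst]] = 1
    return matrix, index_to_user
```
</pre>
<p>Sorting the user ids makes the index assignment deterministic, which keeps a saved map and a saved matrix consistent across runs.</p>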
<dl><dt><a name="ListToMatrixConverter-save"><strong>save</strong></a>(self, map_path, link_matrix_path, use_sparse=False)</dt><dd><tt>Saves the map and link matrix created using the convert function<br>
<br>
Args:<br>
map_path: Path to the file where the map from link matrix index to<br>
user id is to be stored<br>
link_matrix_path: Path to the file where the link matrix is to be stored<br>
use_sparse: True if the link matrix is to be stored as a sparse matrix</tt></dd></dl>
<hr>
Data descriptors defined here:<br>
<dl><dt><strong>__dict__</strong></dt>
<dd><tt>dictionary for instance variables (if defined)</tt></dd>
</dl>
<dl><dt><strong>__weakref__</strong></dt>
<dd><tt>list of weak references to the object (if defined)</tt></dd>
</dl>
</td></tr></table> <p>
<table width="100%" cellspacing=0 cellpadding=2 border=0 summary="section">
<tr bgcolor="#ffc8d8">
<td colspan=3 valign=bottom> <br>
<font color="#000000" face="helvetica, arial"><a name="Logger">class <strong>Logger</strong></a>(<a href="builtins.html#object">builtins.object</a>)</font></td></tr>
<tr bgcolor="#ffc8d8"><td rowspan=2><tt> </tt></td>
<td colspan=2><tt>An instance of <a href="#Logger">Logger</a> can be used as a simple and intuitive interface<br>
for logging<br> </tt></td></tr>
<tr><td> </td>
<td width="100%">Methods defined here:<br>
<dl><dt><a name="Logger-__del__"><strong>__del__</strong></a>(self)</dt><dd><tt>Close the log file when no references to the instance remain</tt></dd></dl>
<dl><dt><a name="Logger-__init__"><strong>__init__</strong></a>(self, log_path, print_stdout=True, sep=' ', end='\n')</dt><dd><tt>Initializes an instance of <a href="#Logger">Logger</a><br>
<br>
Args:<br>
log_path: Path to the file to write the logs to<br>
print_stdout: True if the logs must be written to stdout<br>
sep: string used to separate the arguments when printing<br>
end: string appended after the last argument when printing</tt></dd></dl>
<dl><dt><a name="Logger-log"><strong>log</strong></a>(self, *args)</dt><dd><tt>Logs whatever is present in args with current date and time<br>
<br>
Uses instance variables self.<strong>_sep</strong> for separating elements of args and<br>
self.<strong>_end</strong> after the last element of args. Writes to the log file<br>
self.<strong>_log_file</strong>. If self.<strong>_print_stdout</strong> is True, logs are also written to<br>
stdout<br>
<br>
Args:<br>
args: List of elements to be logged</tt></dd></dl>
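<p>A minimal sketch of a logger with this behaviour is shown below. The class name <tt>SimpleLogger</tt>, the timestamp format, the append mode, and the explicit <tt>close</tt> method are assumptions made for the example (the documented class closes its file in <tt>__del__</tt>).</p>
<pre>
```python
import sys
import time


class SimpleLogger:
    """Timestamped logging to a file and, optionally, to stdout."""

    def __init__(self, log_path, print_stdout=True, sep=' ', end='\n'):
        self._log_file = open(log_path, 'a')
        self._print_stdout = print_stdout
        self._sep = sep
        self._end = end

    def log(self, *args):
        # Assumed timestamp format; the real module may format it differently.
        stamp = time.strftime('%Y-%m-%d %H:%M:%S')
        body = self._sep.join(str(arg) for arg in args)
        message = stamp + self._sep + body + self._end
        self._log_file.write(message)
        self._log_file.flush()
        if self._print_stdout:
            sys.stdout.write(message)

    def close(self):
        self._log_file.close()
```
</pre>
<p>Flushing after every write keeps the log file current if a long-running crawl is interrupted.</p>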
<hr>
Data descriptors defined here:<br>
<dl><dt><strong>__dict__</strong></dt>
<dd><tt>dictionary for instance variables (if defined)</tt></dd>
</dl>
<dl><dt><strong>__weakref__</strong></dt>
<dd><tt>list of weak references to the object (if defined)</tt></dd>
</dl>
</td></tr></table></td></tr></table><p>
<table width="100%" cellspacing=0 cellpadding=2 border=0 summary="section">
<tr bgcolor="#eeaa77">
<td colspan=3 valign=bottom> <br>
<font color="#ffffff" face="helvetica, arial"><big><strong>Functions</strong></big></font></td></tr>
<tr><td bgcolor="#eeaa77"><tt> </tt></td><td> </td>
<td width="100%"><dl><dt><a name="-main"><strong>main</strong></a>()</dt></dl>
</td></tr></table>
</body></html>