Commit ef19e29

wfehrnstrom authored and Kimeiga committed
Dockerfile/CI/CD (#143)
* Added extension for github pages to host our documentation
* We are Mappening
* Fixed legacy api for category search
* Changed name on license
* Updated database access to new setup
* Reorged documentation
* TODO: figure out why this manual change is necessary and how to avoid it
* Updated README
* Failed to load resource, jQuery
* Theme not loading
* Last fix to doc pages
* Added favicon to docs
* Starting filter implementation
* Filters skeleton
* make new eventbrite scrape folder
* fix old printing bugs from previous branch (add ())
* merge in all the log stats changes, to catch up (UNTESTED since API broke)
* does not break on Python 3 because of weird ssl environment issues
* WHY DOESN'T THIS WORK
* Implemented event filtering
* Moved routes/docs to events.py
* temp save, begin adding args
* Implemented filtering by nearby events
* get events of certain location and time
* Fixed implementation for multiple filters, needs testing
* Time periods now check start and end time
* Improved time period filtering/interval checking
* Kind of make some things shorter, with some basic testing done
* edit comment for popularity parameter of filtering
* Fixed comments for filter updates
* Updated documentation to include filtering
* Changed types of vars to correct types in documentation
* Tiny change so that events are not classified in more than 1 period (boundary edge case)
* Isolate my model and rewrite prob prediction function
* Finalize model saving to file
* Write autocategorization function using on disk models
* Finalize interface and create new collection for categorized events
* Update requirements.txt
* Added rough locations documentation
* put eventbrite events in a json
* Breaking larger functions into parts
* Breaking larger functions into parts
* Breaking larger functions into parts
* Fixed api routes
* Split location utils into helper file
* Writing documentation
* Updated location documentation
* Revamped users/auth. TODO: testing and filters
* Minor changes from testing API with postman
* Added functions to interact with user-specific app information
* added user documentation
* Updated documentation to include users. TODO: incorporate locations documentation... should've had that merged first
* Updated user object documentation
* Change Dockerfile and other detail changes
* Add smaller model to github for ez training
* Update comments
* Add some comments to figure out how to integrate other website sources
* Upgrade api v2 to using processed events
* change some names for now, to prepare for eventbrite entering (ALSO MAY NOT WORK)
* Delete legacy code
* change more db names throughout events / location usages, everything's probly broken now
* move the eventbrite scraping file
* Fix requirements.txt hopefully
* nothing
* Added small thing for searching by month(and year)
* Moved documentation to mappening.io
* Sneaking in change for dev instance
* Make /categories work and make event key "categories" instead of "category"
* Fix Error
* Lmao I forgot how months work
* Fix categories too
* Tiny documentation change
* Newly generated documentation
* Update Dockerfile
* Fixed css for documentation
* Fixed css for documentation
* Fix /categories to have all updated categories
* Change tokenize.py to tokenizer.py, begin merging in eventbrite
* Minor changes
* put eventbrite key in .env instead
* temp change some names
* log any output from AWS onto mLab now
* Eventbrite events called automatically now, with edit to categorization (for directory changing)
* clean up: move common definitions to own dir, update correct old events, change dir when needed correctly
* Add model Creation for free food
* add label free food function
* Change dockerfile to make new builds work, and help revolve version of packages downloaded from alpine
* Fixed minor spacing + added a few comments
* Update README.md
* Add google user login authentication
* Added Dockerfile for dev purposes
* Running prod on app.py now requires cmd arg
* Change database collection source
* Make changes to database on login
* Only allow ucla email logins
* added fuzzy string matching script test
* added function to get closest match
* Cleaned up app.py
* Routes to add and remove favorite events
* Minor changes
* finished creating abbreviations map
* combined abbrevations map with alternative names in db
* making changes in locations endpoint
* Make changes to link backend to frontend
* added fuzzy string matching script test
* added function to get closest match
* finished creating abbreviations map
* combined abbrevations map with alternative names in db
* making changes in locations endpoint
* added fuzzy location search
* Moved files around, cleaned up dev changes
* updated fuzzy with testing functionality
* added apprevations search
* Added test function to return top N results
* Add freefood to current labeling flow
* Add prod Dockerfile
* Fixed breaking database line
* added manual event insertion to database
* fixed date and category parsing
* Updated README for AWS instructions
* Commit
* added latitude longitude manual
* Update get_fbscraper.py
* Update eventbrite_scraper.py
* commits added
* added free food
* fixed the damn bug
* added latitude and longitude checking and cover image
* removed comments
* Update events.py
* Update events.py
* Added new package flask-compress which automatically gzips all files served.
* added new collection for manually inserted events
* removed misc import
* added eventbrite collection back in
* added find_dict back in to search events
* commits
* commits
* commits
* commits
* Some basic module renaming and refactoring
* Added definitions, cleaned comments, split eb functionality
* Added example .env file
* Fixed bugs from refactor
* Moved geojson to proper folder and fixed bug in location search
* Moved to Docker-Compose with temporary local Postgres database
* commits
* Adding location data manually in progress
* Removed test info
* Fixed Behavior When No Search Term is Provided
* Added locations data
* Finished up first set of location/address data
* Added temp/ file as restore file
* Had some thoughts and got rid of organizer table
* Fixed Decimal serialization and renamed g_user to user_account
* Update temp restore data folder
* Add diagram of database schema as of now
* Decided that this was a bad idea
* Moved from temp/ folder to zipped file and make commands for convenience
* Removed commented import statement
* Moved Dockerfiles to Python3, initial refactor to reduce build time
* Cleaned up docker files. Python3 now supported, a base image to build off of added
* Small fix or args order
* Modify configuration for EB deployment
* Made sure dev/prod modes worked. BUG: pickle is broken?
* Temporarily keep v2 prefix
* Fixed bug, must read and write in binary format
* Fixed more python 3 things
* Change default Docker prod flag
* Remove thread scheduler call, TODO: replace with cron jobs
* Updated README, cleaned Makefile
* Added api/ back to route prefixes
* Updated README
* Moved from pickle to joblib
* Should test that thread_scheduler actually calls three times a day
* Renamed db to mappening
* added updated puppeteer script for marking events as interested in facebook
* fixed shapely not finding geos-3.7.2 correctly by building geos-3.6.4 from source. Updated maintainers field and changed defualt timeout to allow for building over a slow connection.
* Add integration build to run on pull requests.
* fixed cover image collecting with curl. Began adding fb to update_ucla_events_database
* made inclusion of .env optional based on which runner (human user vs. github actions is used). The runner is default set to USER, but github actions workflows must set the runner to GITHUB to bypass the .env, which the repository doesn't have for security reasons.
* fix worflow syntax.
* test yaml formatting.
* test yaml formatting 2.
* test yaml formatting 3.
* test yaml formatting 4.
* test Makefile.
* test Makefile 2.
* Silence geos make. Fixed syntax and .env conflicting with the continuous integration by making the .env file only one of two routes to get needed credentials.
* almost done with everything for bmaps facebook events
* Fix .env workflow issues, auto-build mappening/base on not finding it among docker images listed.
1 parent d9b17f1 commit ef19e29

13 files changed: +1234 -23 lines changed

.github/workflows/build_for_prod.yml

+43
@@ -0,0 +1,43 @@
+name: Production Server Build Dry-Run
+
+on: [pull_request]
+
+jobs:
+  build:
+    runs-on: ubuntu-latest
+    steps:
+    - uses: actions/checkout@v1
+    # can't really use caching here AFAIK as pip doesn't store installed packages in repo
+    - name: Build Production Server
+      env:
+        RUNNER: GITHUB
+        BASE_NAME: ${{ secrets.BASE_NAME }}
+        BASE_DOCKERFILE: ${{ secrets.BASE_DOCKERFILE }}
+        DEV_DOCKER_COMPOSE: ${{ secrets.DEV_DOCKER_COMPOSE }}
+        AWS_PG_URL: ${{ secrets.AWS_PG_URL }}
+        POSTGRES_URL: ${{ secrets.POSTGRES_URL }}
+        POSTGRES_PORT: ${{ secrets.POSTGRES_PORT }}
+        POSTGRES_USER: ${{ secrets.POSTGRES_USER }}
+        POSTGRES_PASSWORD: ${{ secrets.POSTGRES_PASSWORD }}
+        POSTGRES_DB: ${{ secrets.POSTGRES_DB }}
+        POSTGRES_IMAGE: ${{ secrets.POSTGRES_IMAGE }}
+        API_SERVER_PORT: ${{ secrets.API_SERVER_PORT }}
+        FACEBOOK_APP_ID: ${{ secrets.FACEBOOK_APP_ID }}
+        FACEBOOK_APP_SECRET: ${{ secrets.FACEBOOK_APP_SECRET }}
+        FACEBOOK_SECRET_KEY: ${{ secrets.FACEBOOK_SECRET_KEY }}
+        FACEBOOK_USER_ACCESS_TOKEN: ${{ secrets.FACEBOOK_USER_ACCESS_TOKEN }}
+        GOOGLE_CLIENT_ID: ${{ secrets.GOOGLE_CLIENT_ID }}
+        GOOGLE_CLIENT_SECRET: ${{ secrets.GOOGLE_CLIENT_SECRET }}
+        MLAB_USERNAME: ${{ secrets.MLAB_USERNAME }}
+        MLAB_PASSWORD: ${{ secrets.MLAB_PASSWORD }}
+        MLAB_HOST: ${{ secrets.MLAB_HOST }}
+        POSTGRES_URI: ${{ secrets.POSTGRES_URI }}
+        AWS_PG_URI: ${{ secrets.AWS_PG_URI }}
+        APP_SECRET_KEY: ${{ secrets.APP_SECRET_KEY }}
+        GOOGLE_API_KEY: ${{ secrets.GOOGLE_API_KEY }}
+        GOOGLE_API_KEY_1: ${{ secrets.GOOGLE_API_KEY_1 }}
+        GOOGLE_API_KEY_2: ${{ secrets.GOOGLE_API_KEY_2 }}
+        GOOGLE_API_KEY_3: ${{ secrets.GOOGLE_API_KEY_3 }}
+        GOOGLE_API_KEY_4: ${{ secrets.GOOGLE_API_KEY_4 }}
+        EVENTBRITE_USER_KEY: ${{ secrets.EVENTBRITE_USER_KEY }}
+      run: make prod

Makefile

+16-4
@@ -1,18 +1,30 @@
-# Include Environment Variables from .env file
-include .env
+RUNNER?=USER
+
+# Include Environment Variables from .env file if run as user
+ifeq ($(RUNNER), USER)
+include .env
+endif
 
 ################## LOCAL DEVELOPMENT (Backend Only) ##################
 
 # Build backend image. Must be built before dev work (and only once unless changed)
 build-base:
 	docker build ./src -t $(BASE_NAME) -f $(BASE_DOCKERFILE)
 
+get-base:
+ifneq ($(shell docker images --filter=reference="$(BASE_NAME)" --format "{{.Repository}}"), $(BASE_NAME))
+#ifeq ($(shell docker inspect "$(BASE_NAME)"; echo "$?"), 0)
+# docker pull $(BASE_NAME)
+#else
+	make build-base
+endif
+
 # Run backend in dev mode with local Postgres database
-dev:
+dev: get-base
 	docker-compose -f $(DEV_DOCKER_COMPOSE) up --build
 
 # Run backend in prod mode with AWS Postgres database
-prod:
+prod: get-base
 	docker-compose up --build
 
 # Stops the stack. Can also Ctrl+C in the same terminal window stack was run.
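
The `RUNNER` switch in the Makefile hunk above reads `.env` only when a human runs the build, and relies on variables already present in the environment (the workflow's `env:` block) otherwise. A minimal sketch of the same idea in Node.js follows. The helper names `parseDotenv` and `resolveConfig` are hypothetical and do not appear in the commit, and letting `.env` values win over the environment in USER mode is an assumption about the intended precedence.

```javascript
// Hypothetical sketch, not part of the commit: the Makefile's RUNNER switch
// expressed as a plain function so the two credential routes are explicit.

// Parse simple KEY=VALUE lines, skipping blank lines and # comments.
function parseDotenv(text) {
  const vars = {};
  for (const line of text.split(/\r?\n/)) {
    const trimmed = line.trim();
    if (trimmed === '' || trimmed.startsWith('#')) continue;
    const eq = trimmed.indexOf('=');
    if (eq <= 0) continue; // no key before '=': ignore the line
    vars[trimmed.slice(0, eq).trim()] = trimmed.slice(eq + 1).trim();
  }
  return vars;
}

// env: an object such as process.env.
// readDotenv: callback returning the .env file contents; it is only called
// when the runner is USER, so CI never needs the file to exist.
function resolveConfig(env, readDotenv) {
  const runner = env.RUNNER || 'USER'; // roughly RUNNER?=USER
  if (runner !== 'USER') {
    return { ...env, RUNNER: runner }; // CI route: environment only
  }
  // Assumption: values from .env override the environment in USER mode.
  return { ...env, ...parseDotenv(readDotenv()), RUNNER: runner };
}
```

Locally this could be called as `resolveConfig(process.env, () => require('fs').readFileSync('.env', 'utf8'))`; under GitHub Actions the callback is never invoked because the workflow sets `RUNNER: GITHUB`.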

scraping/puppeteer/facebook.js

+290
@@ -0,0 +1,290 @@
+const puppeteer = require('puppeteer');
+
+
+const USERNAME_SELECTOR = '#email'
+const PASSWORD_SELECTOR = '#pass'
+const LOGIN_SELECTOR = '#loginbutton'
+
+let anchorHref;
+
+async function run() {
+
+  // launch puppeteer
+  const browser = await puppeteer.launch({
+    headless: false,
+    args: ['--disable-notifications']
+  });
+
+  // log in to facebook
+  const page = await browser.newPage();
+  await page.goto('https://facebook.com');
+
+  await page.click(USERNAME_SELECTOR);
+  await page.keyboard.type("[email protected]");
+
+  await page.click(PASSWORD_SELECTOR);
+  await page.keyboard.type("Mappening 2019");
+
+  await page.click(LOGIN_SELECTOR);
+  await page.waitForNavigation();
+
+  // first endpoint
+  await page.goto('https://www.facebook.com/search/events/?q=ucla');
+
+  // turn off asking to show notifications
+  page.on('dialog', async dialog => {
+    console.log(dialog.message());
+    await dialog.dismiss();
+  })
+
+
+  let handles = [];
+
+  let keepCalling = true;
+  let keepCallingTimeout = setTimeout(function () {
+    keepCalling = false;
+  }, 30000);
+
+  // Date: Any Date
+
+  while(true) {
+
+    await page.evaluate('window.scrollTo(0, document.body.scrollHeight)');
+
+    handles = await page.$$('.fbEventAttachmentCTAButton');
+
+
+    console.log(handles.length);
+
+    if(handles.length != 0) {
+      keepCalling = true;
+      clearTimeout(keepCallingTimeout);
+      keepCallingTimeout = setTimeout(function () {
+        keepCalling = false;
+      }, 30000);
+    }
+
+    // if there are no handles just go to the next page
+    if (handles.length == 0 && keepCalling == false) {
+      break;
+    }
+
+    for (let interested of handles)
+      await interested.click();
+
+  }
+
+  keepCalling = true;
+  keepCallingTimeout = setTimeout(function () {
+    keepCalling = false;
+  }, 30000);
+
+
+
+  // Date: Today
+
+  dateRangeAnchors = await page.$$('._4f3b');
+  console.log(dateRangeAnchors);
+  anchorHref = await page.evaluate(anchor => anchor.getAttribute('href'), dateRangeAnchors[5]);
+  console.log(anchorHref);
+  await page.goto(anchorHref);
+
+
+  while(true) {
+
+    await page.evaluate('window.scrollTo(0, document.body.scrollHeight)');
+
+    handles = await page.$$('.fbEventAttachmentCTAButton');
+
+
+    console.log(handles.length);
+
+    if(handles.length != 0) {
+      keepCalling = true;
+      clearTimeout(keepCallingTimeout);
+      keepCallingTimeout = setTimeout(function () {
+        keepCalling = false;
+      }, 30000);
+    }
+
+    // if there are no handles just go to the next page
+    if (handles.length == 0 && keepCalling == false) {
+      break;
+    }
+
+    for (let interested of handles)
+      await interested.click();
+
+  }
+
+
+  keepCalling = true;
+  keepCallingTimeout = setTimeout(function () {
+    keepCalling = false;
+  }, 30000);
+
+  // Date: Tomorrow
+
+  dateRangeAnchors = await page.$$('._4f3b');
+  console.log(dateRangeAnchors);
+  anchorHref = await page.evaluate(anchor => anchor.getAttribute('href'), dateRangeAnchors[6]);
+  console.log(anchorHref);
+  await page.goto(anchorHref);
+
+  while(true) {
+
+    await page.evaluate('window.scrollTo(0, document.body.scrollHeight)');
+
+    handles = await page.$$('.fbEventAttachmentCTAButton');
+
+
+    console.log(handles.length);
+
+    if(handles.length != 0) {
+      keepCalling = true;
+      clearTimeout(keepCallingTimeout);
+      keepCallingTimeout = setTimeout(function () {
+        keepCalling = false;
+      }, 30000);
+    }
+
+    // if there are no handles just go to the next page
+    if (handles.length == 0 && keepCalling == false) {
+      break;
+    }
+
+    for (let interested of handles)
+      await interested.click();
+
+  }
+
+
+  keepCalling = true;
+  keepCallingTimeout = setTimeout(function () {
+    keepCalling = false;
+  }, 30000);
+
+  // Date: This Week
+
+  dateRangeAnchors = await page.$$('._4f3b');
+  console.log(dateRangeAnchors);
+  anchorHref = await page.evaluate(anchor => anchor.getAttribute('href'), dateRangeAnchors[7]);
+  console.log(anchorHref);
+  await page.goto(anchorHref);
+
+  while(true) {
+
+    await page.evaluate('window.scrollTo(0, document.body.scrollHeight)');
+
+    handles = await page.$$('.fbEventAttachmentCTAButton');
+
+
+    console.log(handles.length);
+
+    if(handles.length != 0) {
+      keepCalling = true;
+      clearTimeout(keepCallingTimeout);
+      keepCallingTimeout = setTimeout(function () {
+        keepCalling = false;
+      }, 30000);
+    }
+
+    // if there are no handles just go to the next page
+    if (handles.length == 0 && keepCalling == false) {
+      break;
+    }
+
+    for (let interested of handles)
+      await interested.click();
+
+  }
+
+
+  keepCalling = true;
+  keepCallingTimeout = setTimeout(function () {
+    keepCalling = false;
+  }, 30000);
+
+  // Date: This Weekend
+
+  dateRangeAnchors = await page.$$('._4f3b');
+  console.log(dateRangeAnchors);
+  anchorHref = await page.evaluate(anchor => anchor.getAttribute('href'), dateRangeAnchors[8]);
+  console.log(anchorHref);
+  await page.goto(anchorHref);
+
+  while(true) {
+
+    await page.evaluate('window.scrollTo(0, document.body.scrollHeight)');
+
+    handles = await page.$$('.fbEventAttachmentCTAButton');
+
+
+    console.log(handles.length);
+
+    if(handles.length != 0) {
+      keepCalling = true;
+      clearTimeout(keepCallingTimeout);
+      keepCallingTimeout = setTimeout(function () {
+        keepCalling = false;
+      }, 30000);
+    }
+
+    // if there are no handles just go to the next page
+    if (handles.length == 0 && keepCalling == false) {
+      break;
+    }
+
+    for (let interested of handles)
+      await interested.click();
+
+  }
+
+
+  keepCalling = true;
+  keepCallingTimeout = setTimeout(function () {
+    keepCalling = false;
+  }, 30000);
+
+  // Date: This Weekend
+
+  dateRangeAnchors = await page.$$('._4f3b');
+  console.log(dateRangeAnchors);
+  anchorHref = await page.evaluate(anchor => anchor.getAttribute('href'), dateRangeAnchors[9]);
+  console.log(anchorHref);
+  await page.goto(anchorHref);
+
+  while(true) {
+
+    await page.evaluate('window.scrollTo(0, document.body.scrollHeight)');
+
+    handles = await page.$$('.fbEventAttachmentCTAButton');
+
+
+    console.log(handles.length);
+
+    if(handles.length != 0) {
+      keepCalling = true;
+      clearTimeout(keepCallingTimeout);
+      keepCallingTimeout = setTimeout(function () {
+        keepCalling = false;
+      }, 30000);
+    }
+
+    // if there are no handles just go to the next page
+    if (handles.length == 0 && keepCalling == false) {
+      break;
+    }
+
+    for (let interested of handles)
+      await interested.click();
+
+  }
+
+
+
+  browser.close();
+
+}
+
+run();
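
facebook.js repeats the same scroll-and-click loop once per date filter. One way that loop could be factored out is sketched below. `clickAllUntilIdle` is a hypothetical helper that is not part of this commit, and it swaps the script's `setTimeout` flag for a timestamp comparison so the idle window can be tested with a fake clock. It only relies on `page.evaluate`, `page.$$`, and `handle.click`, which the script already uses.

```javascript
// Hypothetical helper, not part of the commit: a reusable version of the
// loop that facebook.js repeats for each date filter.
// page: a Puppeteer Page (or any object with evaluate() and $$()).
// selector: CSS selector of the buttons to click.
// idleMs: stop after this many milliseconds without finding any button.
// now: injectable clock, defaults to Date.now.
async function clickAllUntilIdle(page, selector, idleMs = 30000, now = Date.now) {
  let lastSeen = now(); // last time at least one matching button was found
  let clicked = 0;

  while (true) {
    // Scroll to the bottom so the page lazy-loads more results.
    await page.evaluate('window.scrollTo(0, document.body.scrollHeight)');

    const handles = await page.$$(selector);
    if (handles.length !== 0) {
      lastSeen = now(); // activity seen: restart the idle window
      for (const handle of handles) {
        await handle.click();
        clicked += 1;
      }
    } else if (now() - lastSeen >= idleMs) {
      // Nothing clickable for idleMs: assume the result list is exhausted.
      break;
    }
  }
  return clicked;
}
```

Under these assumptions, each date section of the script would reduce to a `page.goto(anchorHref)` followed by `await clickAllUntilIdle(page, '.fbEventAttachmentCTAButton')`.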
