Commit c0390df

pbailie authored and bmcutler committed
Move files from main Submitty Repo (#1)
- Files moved from main Submitty repo. - Readme files updated.
1 parent 95ed29a commit c0390df

18 files changed

+1872
-1
lines changed


README.md

Lines changed: 18 additions & 1 deletion
@@ -1 +1,18 @@
-# SysadminTools
+# Submitty Sysadmin Tools Repository
+
+This GitHub repository contains system maintenance and automation scripts for
+Submitty used by Rensselaer Polytechnic Institute, Dept. of Computer Science.
+*Some modification of these scripts may be necessary to be compatible with your
+school's information systems.*
+
+WARNING: Many of these scripts are intended to process private student
+information that is protected by
+[FERPA (20 U.S.C. § 1232g)](https://www2.ed.gov/policy/gen/guid/fpco/ferpa/index.html).
+Please contact your school's IT dept. for advice on your school's data security
+practices.
+
+Licensed under the [BSD 3-Clause License](LICENSE).
+
+### User Documentation
+
+- [Student Auto Feed](http://submitty.org/sysadmin/student_auto_feed)

nightly_db_backup/db_backup.py

Lines changed: 172 additions & 0 deletions
@@ -0,0 +1,172 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-

"""
:file:     db_backup.py
:language: python3
:author:   Peter Bailie (Systems Programmer, Dept. of Computer Science, RPI)
:date:     May 22 2018

This script will take backup dumps of each individual Submitty course
database.  This should be set up by a sysadmin to be run on the Submitty
server as a cron job by root.  Recommend that this is run nightly.

The term code can be specified as a command line argument "-t".
The "-g" argument will guess the semester by the current month and year.
Either -t or -g must be specified.

Dumpfile expiration can be specified as a command line argument "-e".  This
indicates the number of days of dumps to keep.  Older dumps will be purged.
Only old dumps of the semester being processed will be purged.  Argument value
must be an unsigned integer 0 - 999 or an error will be issued.  "No expiration"
(no files are purged regardless of age) is indicated by a value of 0, or when
this argument is omitted.

WARNING:  Backup data contains sensitive information protected by FERPA, and
as such should have very strict access permissions.

Change values under CONFIGURATION to match access properties of your
university's Submitty database and file system.
"""

import argparse
import datetime
import os
import re
import subprocess
import sys

# CONFIGURATION
DB_HOST   = 'submitty.cs.myuniversity.edu'
DB_USER   = 'hsdbu'
DB_PASS   = 'DB.p4ssw0rd'  # CHANGE THIS!  DO NOT USE 'DB.p4ssw0rd'
DUMP_PATH = '/var/local/submitty-dumps'

def delete_obsolete_dumps(working_path, expiration_stamp):
    """
    Recurse through folders/files and delete any obsolete dump files

    :param working_path:     path to recurse through
    :param expiration_stamp: date to begin purging old dump files
    :type working_path:      string
    :type expiration_stamp:  string
    """

    # Filter out any "hidden" files/directories (os.listdir() never returns '.' or '..').
    # The filter is applied to the bare entry name BEFORE the full path is prepended;
    # a full path always starts with '/', so testing it for a leading '.' filters nothing.
    files_list = [os.path.join(working_path, x) for x in os.listdir(working_path) if not x.startswith('.')]

    for file in files_list:
        if os.path.isdir(file):
            # If the file is a folder, recurse
            delete_obsolete_dumps(file, expiration_stamp)
        else:
            # File date was concat'ed into the file's name.  Use regex to isolate date from full path.
            # e.g. "/var/local/submitty-dumps/s18/cs1000/180424_s18_cs1000.dbdump"
            # The date substring can be located with high confidence by looking for:
            # - final token of the full path (the actual file name)
            # - file name consists of three tokens delimited by '_' chars
            #   - first token is exactly 6 digits, the date stamp.
            #   - second token is the semester code, at least one 'word' char
            #   - third token is the course code, at least one 'word' char
            # - filename always ends in ".dbdump"
            # - then take substring [0:6] to get "180424".
            match = re.search(r'(\d{6}_\w+_\w+\.dbdump)$', file)
            if match is not None:
                file_date_stamp = match.group(0)[0:6]
                # YYMMDD stamps compare chronologically as strings (within one century).
                if file_date_stamp <= expiration_stamp:
                    os.remove(file)

def main():
    """ Main """

    # ROOT REQUIRED
    if os.getuid() != 0:
        raise SystemExit('Root required. Please contact your sysadmin for assistance.')

    # READ COMMAND LINE ARGUMENTS
    # Note that -t and -g are different args and mutually exclusive
    # (-e and -t always take a value; a bare "-e" or "-t" is rejected by argparse.)
    parser = argparse.ArgumentParser(description='Dump all Submitty databases for a particular academic term.')
    parser.add_argument('-e', action='store', type=int, default=0, help='Set number of days expiration of older dumps (default: no expiration).', metavar='days')
    group = parser.add_mutually_exclusive_group(required=True)
    group.add_argument('-t', action='store', type=str, help='Set the term code.', metavar='term code')
    group.add_argument('-g', action='store_true', help='Guess term code based on calendar month and year.')
    args = parser.parse_args()

    # Expiration must be an integer 0 - 999 (see the module docstring).
    if not 0 <= args.e <= 999:
        parser.error('-e must be an integer from 0 to 999.')

    # Get current date -- needed throughout the script, but also used when guessing default term code.
    # '%y' is the zero-padded two digit year.  e.g. '2017' -> '17'
    today       = datetime.date.today()
    year        = today.strftime('%y')
    today_stamp = today.strftime('%y%m%d')

    # PARSE COMMAND LINE ARGUMENTS
    expiration = args.e
    if args.g is True:
        # Guess the term code by calendar month and year
        # Jan - May = (s)pring, Jun - July = s(u)mmer, Aug - Dec = (f)all
        if today.month <= 5:
            semester = 's' + year
        elif today.month >= 8:
            semester = 'f' + year
        else:
            semester = 'u' + year
    else:
        semester = args.t

    # The term code is interpolated into SQL and file paths, so only accept 'word' characters.
    if re.fullmatch(r'\w+', semester) is None:
        raise SystemExit("Invalid term code '{}'.".format(semester))

    # GET ACTIVE COURSES FROM 'MASTER' DB
    try:
        sql = "select course from courses where semester='{}'".format(semester)
        # psql -c "COPY (SQL code) TO STDOUT" postgresql://user:password@host/dbname?sslmode=prefer
        # The command is passed as an argument list, so no shell is involved.
        db_url = 'postgresql://{}:{}@{}/submitty?sslmode=prefer'.format(DB_USER, DB_PASS, DB_HOST)
        output = subprocess.check_output(['psql', '-c', 'COPY ({}) TO STDOUT'.format(sql), db_url])
        result = [line for line in output.decode('utf-8').splitlines() if line != '']
    except subprocess.CalledProcessError:
        raise SystemExit("Communication error with Submitty 'master' DB")

    if len(result) < 1:
        raise SystemExit("No registered courses found for semester '{}'.".format(semester))

    # BUILD LIST OF DBs TO BACKUP
    # Initial entry is the submitty 'master' database
    # All other entries are submitty course databases
    course_list = ['submitty'] + result

    # MAKE/VERIFY BACKUP FOLDERS FOR EACH DB
    for course in course_list:
        dump_path = '{}/{}/{}/'.format(DUMP_PATH, semester, course)
        try:
            os.makedirs(dump_path, mode=0o700, exist_ok=True)
            os.chown(dump_path, uid=0, gid=0)
        except OSError as e:
            if not os.path.isdir(dump_path):
                raise SystemExit("Failed to prepare DB dump path '{}'{}OS error: '{}'".format(e.filename, os.linesep, e.strerror))

    # BUILD DB LISTS
    # Initial entry is the submitty 'master' database
    # All other entries are submitty course databases
    db_list   = ['submitty']
    dump_list = ['{}_{}_submitty.dbdump'.format(today_stamp, semester)]

    for course in course_list[1:]:
        db_list.append('submitty_{}_{}'.format(semester, course))
        dump_list.append('{}_{}_{}.dbdump'.format(today_stamp, semester, course))

    # DUMP
    for course, db, dump_file in zip(course_list, db_list, dump_list):
        # pg_dump --file=/var/local/submitty-dumps/semester/course/dump_file.dbdump postgresql://user:password@host/dbname?sslmode=prefer
        db_url    = 'postgresql://{}:{}@{}/{}?sslmode=prefer'.format(DB_USER, DB_PASS, DB_HOST, db)
        dump_file = '{}/{}/{}/{}'.format(DUMP_PATH, semester, course, dump_file)
        try:
            subprocess.check_call(['pg_dump', '--file={}'.format(dump_file), db_url])
        except subprocess.CalledProcessError as e:
            # check_call() does not capture output; pg_dump writes its own message to stderr.
            print("Error while dumping {} (pg_dump exit code {})".format(db, e.returncode))

    # DETERMINE EXPIRATION DATE (to delete obsolete dump files)
    # (do this BEFORE recursion so it is not calculated recursively n times)
    if expiration > 0:
        expiration_date  = datetime.date.fromordinal(today.toordinal() - expiration)
        expiration_stamp = expiration_date.strftime('%y%m%d')
        working_path     = "{}/{}".format(DUMP_PATH, semester)

        # RECURSIVELY CULL OBSOLETE DUMPS
        delete_obsolete_dumps(working_path, expiration_stamp)

if __name__ == "__main__":
    main()
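
The script places the database password inside the connection URL given to `psql` and `pg_dump` on the command line, where it may be visible to other local users in the process list. A minimal sketch of one alternative, assuming libpq's `PGPASSWORD` environment variable is acceptable under your school's security policy. The helper name `build_pg_dump_call` and its parameters are hypothetical and not part of this commit.

```python
import os


def build_pg_dump_call(db_name, dump_file, host, user, password):
    """Return (argv, env) for running pg_dump without a shell.

    Hypothetical helper for illustration; not part of db_backup.py.
    """
    argv = [
        'pg_dump',
        '--host={}'.format(host),
        '--username={}'.format(user),
        '--file={}'.format(dump_file),
        db_name,
    ]
    # Copy the current environment and add the password for libpq to read,
    # so the password never appears in the argument list.
    env = dict(os.environ, PGPASSWORD=password)
    return argv, env


argv, env = build_pg_dump_call(
    'submitty_s18_cs1000',
    '/var/local/submitty-dumps/s18/cs1000/180424_s18_cs1000.dbdump',
    'localhost', 'hsdbu', 'example-password')
print(' '.join(argv))
# To actually run it (requires pg_dump and a reachable database):
#   import subprocess
#   subprocess.check_call(argv, env=env)
```

Whether an environment variable, a `.pgpass` file, or another mechanism is appropriate depends on local policy.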

nightly_db_backup/readme.md

Lines changed: 98 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,98 @@
# Nightly Database Backup Python Script
Readme June 26, 2018

### db_backup.py

This script will read a course list, corresponding to a specific term, from
the 'master' Submitty database.  With a course list, the script will use
PostgreSQL's "pg_dump" tool to retrieve a SQL dump of the Submitty 'master'
database and each registered course's Submitty database of a specific semester.
The script also has cleanup functionality to automatically remove older dumps.

*db_backup.py is written in Python 3, and tested with Python 3.4.*

---

The term code can be specified as a command line argument as option `-t`.

For example:

`python3 ./db_backup.py -t f17`

will dump the Submitty 'master' database and all courses registered with term
code `f17` (Fall 2017).  This option is useful to dump course databases of a
previous term, or to dump course databases that have a non-standard term code.

Alternatively, the `-g` option will have the term code guessed by using the
current month and year of the server's date.

The term code will follow the pattern of TYY, where
- T is the term
  - **s** is for Spring (Jan - May)
  - **u** is for Summer (Jun - Jul)
  - **f** is for Fall (Aug - Dec)
- YY is the two digit year
- e.g. April 15, 2018 will correspond to "s18" (Spring 2018).

`-t` and `-g` are mutually exclusive.

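
The month-to-term mapping described above can be sketched as a small function. This is an illustrative restatement, assuming the Jan-May / Jun-Jul / Aug-Dec boundaries listed here; the name `guess_term_code` is hypothetical and does not appear in the script.

```python
import datetime


def guess_term_code(day):
    """Return a TYY term code for a datetime.date (hypothetical helper)."""
    if day.month <= 5:
        term = 's'      # Spring: Jan - May
    elif day.month >= 8:
        term = 'f'      # Fall: Aug - Dec
    else:
        term = 'u'      # Summer: Jun - Jul
    # '%y' gives the zero-padded two digit year.
    return term + day.strftime('%y')


print(guess_term_code(datetime.date(2018, 4, 15)))   # s18
print(guess_term_code(datetime.date(2018, 7, 1)))    # u18
print(guess_term_code(datetime.date(2017, 9, 5)))    # f17
```

Schools whose terms do not follow these month boundaries would pass the term code explicitly with `-t` instead.
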
---

Each dump has a date stamp in its name following the format of "YYMMDD",
followed by the semester code, then the course code.

e.g. '180423_s18_cs100.dbdump' is a dump taken on April 23, 2018 of the Spring
2018 semester for course CS-100.

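
As an illustration of this naming scheme, the sketch below builds a dump file name and splits one back into its parts. The helper names are hypothetical, and the parser assumes the semester code itself contains no underscore.

```python
import datetime
import re

# date stamp (6 digits) _ semester code _ course code .dbdump
DUMP_NAME = re.compile(r'^(\d{6})_(\w+?)_(\w+)\.dbdump$')


def dump_file_name(day, semester, course):
    """Build a dump file name such as '180423_s18_cs100.dbdump'."""
    return '{}_{}_{}.dbdump'.format(day.strftime('%y%m%d'), semester, course)


def parse_dump_file_name(name):
    """Return (date, semester, course), or None if the name does not match."""
    match = DUMP_NAME.match(name)
    if match is None:
        return None
    stamp, semester, course = match.groups()
    return datetime.datetime.strptime(stamp, '%y%m%d').date(), semester, course


name = dump_file_name(datetime.date(2018, 4, 23), 's18', 'cs100')
print(name)                        # 180423_s18_cs100.dbdump
print(parse_dump_file_name(name))
```
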
Older dumps can be automatically purged with the command line option "-e".

For example:

`python3 ./db_backup.py -t f17 -e 7`

will purge any dumps with a stamp seven days or older.  Only dumps of the
term being processed will be purged, in this example, 'f17'.

The default expiration value is 0 (no expiration -- no files are purged) should
this argument be omitted.

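
The "seven days or older" rule can be sketched as follows: compute the stamp for the cutoff date, then compare it with the first six characters of each file name. The helper names and the sample file names are hypothetical.

```python
import datetime


def expiration_stamp(today, days):
    """Return the YYMMDD stamp for the date `days` days before `today`."""
    return (today - datetime.timedelta(days=days)).strftime('%y%m%d')


def expired(names, stamp):
    """Return the names whose leading date stamp is on or before `stamp`."""
    # YYMMDD strings sort chronologically within one century,
    # so a plain string comparison is sufficient here.
    return [n for n in names if n[0:6] <= stamp]


today = datetime.date(2017, 12, 10)
stamp = expiration_stamp(today, 7)
print(stamp)   # 171203

names = ['171201_f17_cs1000.dbdump',
         '171203_f17_cs1000.dbdump',
         '171204_f17_cs1000.dbdump']
print(expired(names, stamp))
# ['171201_f17_cs1000.dbdump', '171203_f17_cs1000.dbdump']
```
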
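---

Submitty databases can be restored from a dump.  Because the script calls
pg_dump without a format option, each dump is a plain-text SQL script, which is
restored with psql.  The pg_restore tool only reads pg_dump's custom, directory,
and tar archive formats.
q.v. [https://www.postgresql.org/docs/9.5/static/app-pgrestore.html](https://www.postgresql.org/docs/9.5/static/app-pgrestore.html)
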
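
db_backup.py calls pg_dump without a format option, so its dumps are plain SQL that psql can replay. A minimal sketch of assembling such a restore command, assuming the target database already exists and is empty; the helper `build_restore_call` and its default values are hypothetical.

```python
def build_restore_call(dump_file, db_name, host='localhost', user='hsdbu'):
    """Return the psql argument list that replays a plain-text SQL dump."""
    return [
        'psql',
        '--host={}'.format(host),
        '--username={}'.format(user),
        '--dbname={}'.format(db_name),
        '--set=ON_ERROR_STOP=1',          # stop at the first SQL error
        '--file={}'.format(dump_file),
    ]


argv = build_restore_call(
    '/var/local/submitty-dumps/s18/cs1000/180424_s18_cs1000.dbdump',
    'submitty_s18_cs1000')
print(' '.join(argv))
# To actually run it (requires psql and a reachable database):
#   import subprocess
#   subprocess.check_call(argv)
```
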
This script is intended to be run as a cronjob by 'root' on the same server
machine as the Submitty system.  *Running this script on a server other
than Submitty has not been tested.*

---

Please configure options near the top of the code.

DB_HOST: Hostname of the Submitty databases.  You may use 'localhost' if
PostgreSQL is on the same machine as the Submitty system.

DB_USER: The username that interacts with Submitty databases.  Typically
'hsdbu'.

DB_PASS: The password for Submitty's database account (e.g. account 'hsdbu').
**Do NOT use the placeholder value of 'DB.p4ssw0rd'**

DUMP_PATH: The folder path to store the database dumps.  Course folders will
be created from this path, and the dumps stored in their respective course
folders, grouped by semester.

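
The folder layout under DUMP_PATH (semester, then course) and owner-only permissions can be sketched as below. This is an illustration run against a temporary directory rather than the real dump path; the helper `prepare_dump_dir` is hypothetical and assumes a POSIX file system.

```python
import os
import stat
import tempfile


def prepare_dump_dir(dump_root, semester, course):
    """Create <dump_root>/<semester>/<course> and restrict it to its owner."""
    path = os.path.join(dump_root, semester, course)
    os.makedirs(path, exist_ok=True)
    # The mode given to makedirs() is subject to the process umask,
    # so set the final mode of the course folder explicitly.
    os.chmod(path, 0o700)
    return path


with tempfile.TemporaryDirectory() as dump_root:
    path = prepare_dump_dir(dump_root, 's18', 'cs1000')
    print(os.path.relpath(path, dump_root))           # s18/cs1000
    print(oct(stat.S_IMODE(os.stat(path).st_mode)))   # 0o700
```

Only the course folder's mode is set here; the parent folders keep whatever mode the umask allows.
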
---

WARNING: Database dumps can contain student information that is protected by
[FERPA (20 U.S.C. § 1232g)](https://www2.ed.gov/policy/gen/guid/fpco/ferpa/index.html).
Please consult with your school's IT dept. for advice on data security
practices and policies.

---

db_backup.py is tested to run on Python 3.4 or higher.

NOTE: Some modification of code may be necessary to work with your school's
information systems.
