
Bryan's Notes

https://www.digitalocean.com/community/tutorials/how-to-serve-django-applications-with-uwsgi-and-nginx-on-ubuntu-14-04

TODO

Catch-all for miscellaneous stuff I'd like to get done

  • Make webapp and microdata independent applications

  • Change db password (remove password from source too)

  • Fill empty groupsets?

  • Lock down API endpoints

  • Database backups

  • Server deployment script

{
	"device": "/api/device-api/3/",
	"time": [start, frequency],
	"dataPoints":
		[
			{"wattage": 400}
		]
}

5-7-2015

PG&E's Rate Plans: here is a library of rate plans with exact pricing information. It will be useful for implementing Rate Plans per device.

http://www.pge.com/tariffs/ERS.SHTML#ERS

4-5-2015

Webserver

I focused my energy on the notification system today since, up to this point, it was all but nonexistent. When I started, the settings page was able to keep track of the user's notification preferences and save them in the User model for later. The following components needed to be worked out to get automated email delivery:

  • Email Service
  • Python API for Email Service
  • Email Templating

The Email Service is handled by Amazon SES, chosen because it is offered in the same suite and style as the Amazon EC2 instance the web server runs on. This gives us the ability to pay as needed. From the SES console, I can view statistics on email activity and set up notifications as well.

The Python library suggested by Amazon is boto3, which exposes a very easy-to-work-with interface for the Amazon SES layer:

import boto3

ses = boto3.client('ses')

ses.send_email(Source=Source, Destination=Destination, Message=Message)

And it's as simple as that. The one limitation with SES is that the default email mode is Sandbox, meaning emails must be sent from and delivered to "verified" email addresses, i.e. ones that I own. This can be changed with a request, however, and will be done before too long.
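For reference, a fuller call looks something like this (the addresses and message strings are placeholders; the Destination/Message shapes follow the boto3 SES documentation):

import boto3

ses = boto3.client('ses')

ses.send_email(
    # Must be a verified address while the account is in Sandbox mode
    Source='notifications@seads.brabsmit.com',
    Destination={'ToAddresses': ['user@example.com']},
    Message={
        'Subject': {'Data': 'Your weekly consumption report'},
        'Body': {'Text': {'Data': 'Hello! Here is your report...'}}
    }
)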

The email templates work in much the same way as an html template works for the rest of the website. A text file exists in the static directory that defines a template with context keywords ({{ keyword }}) built in. When this file is loaded, it can be converted into a template and populated with context:

from django.template import Context, Template

text = ""
with open('email_template.txt', 'r') as f:
    text = f.read()
template = Template(text)
context = Context({'username': 'admin', 'first_name': 'Bryan'})

template.render(context)

This will render the text as a template with the given context.

Putting this all together, I came up with a custom Django command that can be run via manage.py by creating the following directory structure within the webapp:

webapp/
    __init__.py
    models.py
    management/
        __init__.py
        commands/
            __init__.py
            email_interval.py
    tests.py
    views.py

Within email_interval.py, we simply need to define a Command class that extends BaseCommand; the Django documentation on custom management commands covers the details.
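A minimal sketch of what email_interval.py might contain (assuming a Django version with add_arguments, i.e. 1.8+; the opt-in check is an assumption that leans on the UserSettings/Notification models described in the 2-21-15 entry):

from django.contrib.auth.models import User
from django.core.management.base import BaseCommand

class Command(BaseCommand):
    help = 'Email consumption reports to users opted in to the given interval'

    def add_arguments(self, parser):
        parser.add_argument('interval', choices=['weekly', 'monthly'])

    def handle(self, *args, **options):
        interval = options['interval']
        for user in User.objects.filter(usersettings__isnull=False):
            # Assumed opt-in check: a Notification description such as
            # "Weekly consumption details" contains the interval name
            if user.usersettings.notifications.filter(
                    description__icontains=interval).exists():
                self.stdout.write('Emailing %s' % user.username)
                # render the email template and send it through SES here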

With the custom command in place, it can be executed as follows:

python manage.py email_interval <weekly, monthly>

This should be added to a cron job that runs on the specified interval. Running the command via manage.py will send an email to all users that are opted in to the specified interval.
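For example, crontab entries along these lines would do it (the times are my guesses; the project path matches the chdir in uwsgi.ini below):

# m h dom mon dow  command
0 8 * * 1  cd '/home/ubuntu/seads-git/ShR2/Web Stack' && python manage.py email_interval weekly
0 8 1 * *  cd '/home/ubuntu/seads-git/ShR2/Web Stack' && python manage.py email_interval monthly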

3-4-2015 through 4-4-2015

3-3-2015

Webserver

Google Maps API

I added a map to the main dashboard display that shows all devices that a user owns. It is based on the Markers API within Google Maps. Each marker is assigned the location of a device based on the coordinates provided by the device itself:

{% for device in my_devices %}

var marker{{ device.serial }};

{% endfor %}


{% for device in my_devices %}

    var pos = new google.maps.LatLng({{ device.position }});
    LatLngList.push(pos);

    marker{{ device.serial }} = new google.maps.Marker({
        map: map,
        position: pos,
        title: '{{ device.name }}',
    });

{% endfor %}

The markers are declared before assignment so they can be referenced later by other functions; for example, we need to define an action for when the user selects a marker on the map:

google.maps.event.addListener(marker{{ device.serial }}, 'click', function() {
    {% for device2 in my_devices %}
        infowindow{{ device2.serial }}.close();
    {% endfor %}
    infowindow{{ device.serial }}.open(map, marker{{ device.serial }});
    $.get('/charts/device/{{ device.serial }}/chart/',
        {
            'stack': true
        },
        function(data) {
            $('#ajax-container').html(data);
            render_chart();
            $('#chart-title').html("{{ device.name }} at a glance");
            $('#chart-title').val("{{ device.serial }}");
        }
    );
});

This custom behavior for a marker not only displays an info window, it also loads a new graph for the selected device. This is in line with keeping the dashboard fully interactive, where user input has a real effect on the experience.

InfoWindow Loaded via AJAX

The following calculations are done to populate an InfoWindow with data:

  • Average usage: sum of the average series for all appliances, divided by the number of appliances:

    for appliance in appliances:
        averages += db.query('select mean(wattage) from device.'+str(device.serial)+'.'+appliance)
    averages /= len(appliances)

  • Current usage: sum of all appliance series at the latest timestamp:

    for appliance in appliances:
        this_wattage = db.query('select * from 1m.device.'+str(device.serial)+'.'+appliance)
        if this_wattage[0] > time.time() - 1000:
            current_wattage += this_wattage

The average wattage calculation is especially time consuming because it needs to calculate a mean value for every appliance the device is measuring. Because of this, the call to populate the data of an InfoWindow is done asynchronously and begins as soon as the InfoWindow is bound to a marker. This lets other pieces of the dashboard load while the calculation runs. However, it means an InfoWindow is not available straight away, since it needs time to calculate.

3-2-2015

Webserver

Today, we hit 75% utilization on our EBS volume. Some backstory: this is the volume I used to install the webserver back in November when things were just starting to spin up. At the time, I selected the default volume size of 8 GB. Obviously this would not be enough in the long run, but it saved us some cash for the first few months when our instance did not need all that space.

Now that we have webpages and databases, it was time to expand the volume. Thankfully we went with an EC2 instance through Amazon, which makes expanding a volume straightforward. The steps are essentially as follows (a scripted sketch follows the list):

  1. Stop the instance
  2. Create a snapshot of the volume
  3. Apply snapshot to new volume of a larger size
  4. Detach old volume from instance
  5. Attach new volume to instance (/dev/sda1)
  6. Start instance
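The same sequence can be scripted with boto3 (a sketch; all IDs, the size, and the availability zone are placeholders, and in practice you would wait for the snapshot and new volume to become available between steps):

import boto3

ec2 = boto3.client('ec2')

ec2.stop_instances(InstanceIds=['i-0123456789abcdef0'])
snap = ec2.create_snapshot(VolumeId='vol-11111111')
vol = ec2.create_volume(SnapshotId=snap['SnapshotId'],
                        Size=32,  # new, larger size in GiB
                        AvailabilityZone='us-east-1a')  # must match the instance's AZ
ec2.detach_volume(VolumeId='vol-11111111')
ec2.attach_volume(VolumeId=vol['VolumeId'],
                  InstanceId='i-0123456789abcdef0',
                  Device='/dev/sda1')
ec2.start_instances(InstanceIds=['i-0123456789abcdef0'])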

There are a few ways to verify the volume was in fact expanded, and those are outlined in Amazon's EBS documentation.

2-27-15

Webserver

Continuing with the development of the settings form, I created three more dynamic fields: Utility Company, Rate Plan, and Territory. These three together form a group of dropdowns on the device settings page. The following methods were added to webapp/models.py:

class UtilityCompany(models.Model):
   description = models.CharField(max_length=300)
   #TODO add model fields to describe actions

   def __unicode__(self):
      return self.description

class RatePlan(models.Model):
   description = models.CharField(max_length=300)
   #TODO add model fields to describe actions

   def __unicode__(self):
      return self.description

class Territory(models.Model):
   description = models.CharField(max_length=300)
   #TODO add model fields to describe actions

   def __unicode__(self):
      return self.description


class DeviceSettings(models.Model):
   device = models.OneToOneField(Device)
   utility_companies = models.ManyToManyField(UtilityCompany)
   rate_plans = models.ManyToManyField(RatePlan)
   territories = models.ManyToManyField(Territory)

The three classes UtilityCompany, RatePlan, and Territory are essentially the same class with different names. These are linked to a DeviceSettings class through the use of a ManyToManyField in the same fashion as a Notification is linked to a UserSettings.

From here, we populate the three classes with entries by registering them with the admin interface in webapp/admin.py:

from webapp.models import UtilityCompany, RatePlan, Territory

admin.site.register(UtilityCompany)
admin.site.register(RatePlan)
admin.site.register(Territory)

The form itself initializes the value of the "Device Name" field with the name of the first device returned in the context of the page. Most of the form's functionality works off this first device. I would like to get it to a place where the device selection dropdown actually has an effect on form behavior.

Inputs of the form

<input id="username" name="device-name" value="{{ devices.0 }}" type="text" class="form-control input-md">

have been replaced with the template equivalent:

{{ form.new_name }}

for all form items that can be submitted.

In webapp/views.py are the functions required to parse the form and apply the changes. Until now, I had envisioned separate forms for separate functions, but I found it easier to combine them into one large form with all fields non-required. This way, the client can submit the whole form without having to tell me what it's for (i.e. user account settings, device settings, dashboard settings).

2-26-15

Webserver

SSL is now the default mode of communication between clients and the server. This was done by creating a self-signed certificate, pointing nginx at it on port 443, and redirecting all traffic from :80 to :443. This was accomplished in the following series of steps:

Generate certificate:

$ openssl genrsa -out server.key 4096
$ openssl req -new -key server.key -out server.csr
$ openssl x509 -req -days 365 -in server.csr -signkey server.key -out server.crt

uwsgi.ini:

[uwsgi]

socket=/tmp/uwsgi.sock
chmod-socket=644
uid = www-data
gid = www-data
 
chdir=/home/ubuntu/seads-git/ShR2/Web Stack
module=seads.wsgi:application
pidfile=/home/ubuntu/seads.pid
vacuum = true

nginx.conf:

upstream django {
    server unix:///tmp/uwsgi.sock;
}
 
server {
    listen 80;
    return 301 https://$host$request_uri;
}

server {
    listen 443 ssl;

    ssl_certificate	/home/ubuntu/server.crt;
    ssl_certificate_key	/home/ubuntu/server.key;
    error_log /home/ubuntu/nginxerror.log;
    location / {
        uwsgi_pass  django;
        include /etc/nginx/uwsgi_params;
        }
    
    location /static {
        autoindex on;
        alias "/home/ubuntu/seads/webapp/static/";
        }
}

By keeping a listen server on port 80, we can issue a 301 permanent redirect so web browsers connecting via HTTP are sent to HTTPS.

As great as this is, it's causing issues with the ESP8266, which was previously relying on HTTP. It looks like in the newest SDK (0.9.5) there is an espconn_secure_connect() which should replace our espconn_connect().

2-25-15

Webserver

  • Admin Interface

Today I continued progress on the automatically generated form. I really like the idea of having the form controlled through the admin interface, allowing the possibility of a different form for every person. The Django admin interface is easy to customize: calling admin.autodiscover() in urls.py scans each application's admin.py and generates the admin interface from what it finds.

In each application, I've placed code pertinent to what needs to be reflected in the admin interface. For instance, in the Microdata application, the admin interface looks like this:

admin.py

from django.contrib import admin

from microdata.models import Device
from webapp.models import DeviceSettings
# Register your models here.

class DeviceSettingsInline(admin.StackedInline):
    model = DeviceSettings
    can_delete = False
    verbose_name_plural = 'devicesettings'

class DeviceAdmin(admin.ModelAdmin):
    list_display = ('name','owner','serial','position','secret_key','registered','fanout_query_registered',)
    search_fields = ('name','serial')
    readonly_fields = ('secret_key',)
    inlines = (DeviceSettingsInline,)

admin.site.register(Device, DeviceAdmin)

This file encapsulates most of what goes on in the other admin files: a StackedInline paired with a ModelAdmin. This allows the modification of related models in the Device form:

Image

  • Settings Page

Continued progress on getting functionality built into the page. List of things completed vs uncompleted:

Completed:

  • Account
    • Change username
    • Change password
    • Notification preferences rendering
    • Add device
  • Device
    • Device name
    • Connection status
    • Change location

Uncompleted:

  • Account
    • Notification preferences selection
  • Device
    • Remove device
    • Utility information render and selection
  • Dashboard
    • All

2-23-15

Webserver

I ran into some issues that need to be worked out today regarding real-time data and timezone mismatching. Here is what I know thus far:

The real-time updating works on all resolutions except 1s (the most interesting one). From inspecting the request sent by the client, all resolutions EXCEPT 1s format the timestamps in correct local time, while the 1s resolution sends the local time as GMT:

Minute resolution:

http://seads.brabsmit.com/charts/device/1/?unit=m&from=1424810272&to=1424810332

Second resolution:

http://seads.brabsmit.com/charts/device/1/?from=1424781420&to=1424781540&unit=s

From what I can tell, the chart calls the API endpoint itself with an incorrect timestamp, irrespective of the realTimeData() function I created which actually returns valid points. The distinction is that the function will format the request with the unit at the end, while the chart formats it with the unit at the beginning.

2-21-15

Webserver

I went into the Settings form page with the idea of generating a semi-dynamic settings page that can be managed via an administrator interface. This means that I'm giving the administrator the opportunity to customize a specific user or set of users' settings interface. For example, if in the future we wanted to break users into two categories, i.e. free and paid, then one would certainly have more functionality than the other.

In that regard, I've started by creating two new models to handle Notification Preferences:

class Notification(models.Model):
   description = models.CharField(max_length=300)

   def __unicode__(self):
      return self.description

class UserSettings(models.Model):
   user = models.OneToOneField(User)
   notifications = models.ManyToManyField(Notification)

A UserSettings class will be linked to a user through the use of a OneToOneField. As of right now, the Notification class is the only customizable UserSetting, which is a ManyToManyField. The Notification class is registered uniquely in the admin interface, giving the administrator the opportunity to add/remove Notifications.

The UserSettings class, since it is linked to the user, is registered under the User model in the admin interface, giving the ability to link/unlink given Notifications relevant to the user. Giving a user a UserSettings class is done manually through the admin interface at this time.
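A sketch of that admin registration (the inline pattern mirrors the DeviceSettingsInline shown in the 2-25-15 entry above):

from django.contrib import admin
from django.contrib.auth.admin import UserAdmin
from django.contrib.auth.models import User

from webapp.models import Notification, UserSettings

# Standalone registration lets the administrator add/remove Notifications
admin.site.register(Notification)

class UserSettingsInline(admin.StackedInline):
    model = UserSettings
    can_delete = False

class CustomUserAdmin(UserAdmin):
    inlines = (UserSettingsInline,)

# Re-register User so UserSettings shows up under the User model
admin.site.unregister(User)
admin.site.register(User, CustomUserAdmin)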

With this new dynamic field, the form needs to be dynamic itself:

class UserForm(forms.Form):
  new_username = forms.CharField(max_length=254,
                                 min_length=1,
                                 required=False,
                                 widget=forms.TextInput(attrs={
                                    'class' : 'form-control input-md'}))
  error_messages = {
    'password_mismatch': ("The two password fields didn't match."),
  }
  password1 = forms.CharField(label=("Password"),
      widget=forms.PasswordInput(attrs={'class' : 'form-control input-md'}),
      required=False)
  password2 = forms.CharField(label=("Password confirmation"),
      widget=forms.PasswordInput(attrs={'class' : 'form-control input-md'}),
      help_text=("Enter the same password as above, for verification."),
      required=False)
  notifications = forms.ChoiceField(
      widget=forms.CheckboxSelectMultiple(),
      # Default choices, used only when no user is supplied
      choices=(
         ("1", "Don't send any email"),
         ("2", "Weekly consumption details"),
         ("3", "Monthly consumption details"),
         ("4", "When we detect irregular household consumption"),
         ("5", "When we detect an irregular device"),
        ),
      required=False
  )

  def __init__(self, *args, **kwargs):
    user = kwargs.pop('user', None)
    super(UserForm, self).__init__(*args, **kwargs)
    if user:
      # Build the choice list per instance; a class-level list would be
      # shared across all forms and grow with every instantiation
      choices = []
      for i, choice in enumerate(user.usersettings.notifications.all(), start=1):
        choices.append((str(i), choice))
      self.fields['notifications'] = forms.ChoiceField(
        widget=forms.CheckboxSelectMultiple(),
        choices=choices,
        required=False
      )

Several methods are at work here that should be covered:

  • Default constructor: if a form is initialized with empty arguments form = UserForm(), a default set of notifications will be generated. The default constructor will most likely never be run.
  • Custom constructor: if the user kwarg is provided form = UserForm(user=user), we can populate the notifications field with the provided notifications relevant to the user: user.usersettings.notifications.all().

2-19-15

Webserver

I continued to add to the settings forms to get them into a state that I can start to fill in the functionality on the backend. The Settings page is broken down into three categories, and can be switched between with the nav menu on the left hand side of the screen. For now, I made the settings page sit in the page-container div surrounded by the navbar and the sidebar, but I may choose to forgo the sidebar since it is not really part of the dashboard per se.

Image

As of right now, most of the forms are non-functional since the corresponding backend actions are not yet completed. The next step is to fill in the functionality on the User form as well as the Device form. I will hold off on the Dashboard form until we have a better idea of what needs to go there.

  • Device Status

The one piece of functionality I did build was the device status interface, which is a button that lives on the Devices page:

Image

The device status is determined by examining the newest data point from the device in its InfluxDB series (device.<serial>). We are assuming here that every device transmits regularly; therefore, if the newest point is older than 10 seconds, we render the device offline. Clicking the button checks the device status again.

Client:

$('#btn-status').on('click', function(e) {
    $.get('/settings/device/status/',
        {'serial': {{ devices.0.serial }}},
        function(status) {
            if (status.connected) {
                // Change device status button to "Connected"
            } else {
                // Change device status button to "Disconnected"
            }
        }
    );
});

Server:

import json

from django.http import HttpResponse
from microdata.models import Device

def device_status(request):
    connected = False
    if request.method == 'GET':
        serial = request.GET.get('serial', None)
        if serial:
            device = Device.objects.get(serial=serial)
            # checks the newest InfluxDB point; see device_is_connected (2-18-15)
            connected = device_is_online(device)
    context = {}
    context['connected'] = connected
    return HttpResponse(json.dumps(context), content_type="application/json")

2-18-15

Webserver

I tackled some repetitive yet necessary work today relating to the settings page. Much needs to happen on this page, as it is the main hub for all controls that we hand over to the user. I put some thought into the layout and decided on the following:

Three main groups: User, Device, and Dashboard. Within the three groups, the settings are fleshed out in their respective group.

- user
-- add device
-- change username
-- change password
-- update notification preferences

- device
-- device name
-- device status (readonly)
-- device location
-- utility company
-- rate plan
-- territory
-- remove device

- dashboard
-- default chart type
-- default time range
-- default resolution

For the device status, we can assume the device is connected if it has entered data into the database less than 10 seconds before now:

def device_is_connected(db, device):
    result = db.query('select * from device.'+str(device.serial)+' limit 1;')[0]['points'][0]
    return int(time.time()) - int(result[0]) < 10

2-17-15

Webserver

I backtracked today by taking the work of rendering the map off the server side. I did this because the gmapi module for Django is incomplete; it does not implement all aspects of the Google Maps API. Therefore, I had to move to a full JavaScript implementation to get the functionality we need. For example, the Python implementation does not include Places, which we use for autocomplete on a search box:

var input = /** @type {HTMLInputElement} */(
      document.getElementById('pac-input'));
map.controls[google.maps.ControlPosition.TOP_LEFT].push(input);

var autocomplete = new google.maps.places.Autocomplete(input);
autocomplete.bindTo('bounds', map);

As of right now, the debug map can pinpoint the user's current location through HTML5, offer a search box with autocomplete, and center the map on the searched location. The next step is to get the marker's position and store it as the device's location:

marker.getPosition()

mf {k: 37.0009446, D: -122.0628405, toString: function, j: function, equals: function…}

I've added the functionality to the "Add Device" modal on the dashboard, with a few bugs:

  • Autocomplete making requests but not rendering
  • Unable to reference the marker on the map

Since Google Maps API uses document.write() to embed javascript, special care had to be taken with the modal to take into account asynchronous requests:

$(function() {
  var script = document.createElement('script');
  script.type = 'text/javascript';
  script.src = 'https://maps.googleapis.com/maps/api/js?v=3.exp' +
      '&signed_in=true&callback=initialize';
  document.body.appendChild(script);
});

2-15-15

Webserver

  • Google Maps

I added Geolocation to the map today by taking advantage of the Places Library available in the API. With this, the user is able to search for the location of their device with bias set on the current viewport. That is, the autocomplete search field will suggest locations closest to the user first before widening the search to the rest of the world.

This was done by adding an event listener to the search box, and triggering it on focus. The trigger calls the Autocomplete function of Google maps, giving a small list of options for the user to select. As the user enters more of the address or place, the Autocomplete becomes more accurate.

Image

In addition, I've made the marker that is placed on the map at the user's current location draggable. The idea being, the marker should be dragged to the location of the seads device, and the server will save that location. This was done by modifying the marker options on the backend python debug/views.py:

marker = maps.Marker(opts = {
        'map': gmap,
        'position': maps.LatLng(lat,lon),
        'draggable':True,
        'title':"Drag me!"
   })

Since I'm doing the map rendering and serving on the server side, I'm missing out on a lot of functionality that could be useful in the future, such as an integrated search bar, which would make the map look more appealing. I am going to look into adding functionality to the Python code to see if I can get something working.

2-13-15

Webserver

Hotfixed the static files issue by reworking some config variables until it worked. I really need to look into this issue, which spawned when I moved the project into the GitHub repository.

With the static files working, the map now renders itself as a javascript object allowing interactivity. On top of that, I'm also using location services with IP and HTML5 methods. The workflow for the map is as follows:

  1. User requests the map
  2. A map is generated at the center of the coordinates determined by their IP
  3. The page is served with a rendered map
  4. The page asks for the users location through HTML5
  5. If granted, a new chart loads with a more accurate position, and a marker is placed at that location

Determine user's general area through IP:

from django.contrib.gis.geoip import GeoIP  # Django 1.x GeoIP binding

g = GeoIP(path='/home/ubuntu/seads-git/ShR2/Web Stack/webapp/static/webapp/dat/')
ip = request.META.get('REMOTE_ADDR', None)
if ip:
    lat,lon = g.lat_lon(ip)

Then refine using coordinates supplied by HTML5:

lat = float(request.POST.get('lat', 0))
lon = float(request.POST.get('lon', 0))

Geolocation through the browser is well documented in the Dive Into HTML5 article.

We do this because, eventually, I need a map the user can interact with to tell us where their device is. If the map is generated at their current location, that's a pretty good guess as to where the device is in the world. From there, the user should be able to input a more specific location with the search bar provided, or just use what was generated.

A Map is generated using the django-gmapi API interface:

gmap = maps.Map(opts = {
        'center':maps.LatLng(lat,lon),
        'mapTypeId':maps.MapTypeId.ROADMAP,
        'zoom':15,
        'mapTypeControlOptions':{
           'style': maps.MapTypeControlStyle.DROPDOWN_MENU
        },
   })

And can be served to the page by JSON serialization and deserialization by a jquery plugin.

Image

2-12-15

Webserver

Henry brought up a good point today: if a device is deleted, all accompanying data points and queries in the database should be deleted with it. Since our database is now a third-party extension to Django, we must do this manually.

This is possible by overriding Django's built in model classes just like we did for the save() method:

def delete(self, *args, **kwargs):
    # custom delete code here
    super(Device, self).delete()

Two things need to happen in the database when deleting a device (a sketch follows the list):

  1. Delete all series the device is associated with, and
  2. Delete all continuous queries.
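A rough sketch of the override (the 'drop series' syntax is from the InfluxDB 0.8 docs; the appliance list helper and the continuous-query cleanup are assumptions):

from influxdb import client as influxdb

def delete(self, *args, **kwargs):
    db = influxdb.InfluxDBClient('localhost', 8086, 'root', 'root', 'seads')
    serial = str(self.serial)
    # 1. Drop the raw series plus one fanout series per known appliance
    db.query('drop series device.%s' % serial)
    for appliance in appliances_for(self):  # hypothetical helper
        db.query('drop series device.%s.%s' % (serial, appliance))
    # 2. Continuous queries: list them with 'list continuous queries', then
    #    'drop continuous query <id>' for each one that mentions this device
    super(Device, self).delete(*args, **kwargs)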

Location Services

I started playing around with the location-based services Django has to offer for our server. It is becoming necessary to set up some sort of framework that allows a user to pinpoint the location of their device; quite frankly, a zipcode is not good enough. If we could get the user to pinpoint the location on a map, we would have access to much more data about the location than a zipcode could supply.

First iteration:

At first I explored GeoDjango but ran into trouble building the source and dependencies. After re-evaluating, I realized that something this complex and intensive is not necessary for what we are trying to do.

In that spirit, I found a Django addon called django-gmapi that is easily installable. As of right now, I've only been able to generate a map centered at given coordinates and displayed on a webpage:

Image

I'm running into static file serving problems, will address tomorrow.

ESP8266

Henry and I talked through some conceptual challenges that we may face in future development with the ESP module:

  • Device pairing: how are we going to link the physical device and its data to a user on the website?

    • I suggested having a random key sent to the device from the server once the device is registered. However, this method requires the device to be responsive to the return packet and be able to, after registering, render the code to the user.

    • Another option is to have the device itself generate the random key and send it to the server when registering. That way, the ESP can render its key to the user before it reconfigures itself into STA mode only.

I've already developed the pairing key algorithm on the server side to account for the first option. However, given those two methods we discussed, we decided to move forward with the latter in favor of keeping the complexity of the ESP development down. I can always compensate for the ESP's lack of computing power on the server side. It will not be too difficult to switch to the second method.

2-10-15

Webserver

  • Real-time data

Work continued as usual today on the Load Disaggregation visualization. Today I focused on getting the data to be served in real time to provide a feeling of some sort of a live dashboard. I was reasonably successful in coming up with an efficient and scalable solution that does not involve a constant connection to the webserver by the client.

On Zoomcharts, it is very easy to add new data to an existing chart. Working off of This Example, adding data to a chart is almost trivial:

chart.addData({unit: unit, values: data});

In this way, data can be added incrementally at a set interval. I have set it up in such a way that, given the user's set resolution, the data will refresh at that frequency. For example, if the user set the resolution to 5 minutes, then the data will refresh every 5 minutes. Likewise if it were set at one second, then a new set of data points will arrive every second. This was done by creating a function that, once complete, sets a timeout on itself:

// data refresh pseudocode
// assumes chart resolution of 5s

function refresh() {
    updateInterval = 5000;
    start = Date.parse('now - 5 seconds');
    stop = Date.parse('now');
    $.get('/charts/device/{{ my_devices.0.serial }}/?unit=5s&from='+start+'&to='+stop, function(ajax_data) {
        chart.addData({unit: '5s', values: ajax_data.data});
    });
    setTimeout(refresh, updateInterval);
}

A few things to note about this algorithm:

  1. Update interval will only be re-evaluated at the time of the function call. This means that, if the user were to increase the resolution, then the function will not be called until the end of the interval set previously by the lower resolution. This could be fixed in the future (TODO).

  2. The chart will always be updated with data from the first device that the owner is paired to. This is how most of the charting is set up right now, so it's making the assumption that the owner is only paired to a single device. This should be fixed when I add multi-device support to the dashboard (TODO).

  3. I'm using date.js to determine start and stop intervals, hence the unique Date parsing function. It is very useful to be able to have intervals that can be parsed into time values, since the chart itself will return an interval such as "5m, 1h, 1d, 1s, etc...".

2-9-15

Today I took a detour on our project to explore transient analysis on a power grid at a high level. I did this because, for a brief moment, we played with the idea of using our seads devices as monitors on the grid that could detect power line faults in an effective way.

From reading Teddy's paper on Power Line Fault Detection, it is clear that in order to do this type of study, there needs to be a seads device at every node in the network to effectively monitor the grid:

Image

This network, with a monitoring device at every node, can produce the following data after an A-to-ground fault:

Image

Because of this, and after talking with Paul, we backed away from this venture and put our eyes back on the prize.

ESP8266

The location field for a seads device is currently a zip code. This is a pretty bad way to go about location grouping if we want regional data more granular than the city level. I think for starting out, with no more than five or so devices, we should make the location field much more precise by asking for the physical address of the device, from which we can later look up coordinates. I've added this item to the TODO.

2-5-15

I installed Grafana on the webserver today to get a better view of the data in the database:

Image

I can customize the charts by adding queries that are interpreted as timeseries values. A helpful feature of Grafana is the automatic refresh, which allows me to see the events as they are entered into the database or for when a continuous query is running.

Which brings up another point: I found out today that continuous queries in InfluxDB have an important limitation. As the documentation points out, a continuous query runs at the top of whatever interval it specifies. This means that a query like this:

select mean(wattage) from /^device.1.*/ group by time(1m) into 1m.:series_name

will wait until the top of the minute, go back in the series, take the mean of the last 60 seconds, and store it in the smoothed table. The limitation is that the data must be present in the database when the calculation for that minute is made, or it won't be added to the table. For this reason, if data arrives that is older than what the continuous query has already processed, it will not be backfilled into the smoothed table.

As a workaround, dropping the continuous query and adding it back will trigger a backfill. But this is not a permanent solution. There may be a manual query that can be run on data that is known to be outside of the calculation window.

On an exciting note, I added a logo to the homepage in the Jumbotron. It makes the site feel a little more legitimate. My hope is to get a higher-resolution image for later presentations. Perhaps I can vectorize this logo; it seems simple enough.

I made changes to the pseudodata.py script to bring it back up to date regarding recent API changes. I removed the authentication token header, made the data more random, and added checking for errors in the POST. It will now error out if a request did not return 201 (Created).

As far as Zoomcharts goes, it is still holding up against my specific needs. A setting I discovered today: noDataPolicy (default join) can be altered so that wherever there is no data, the chart will not draw a line across the gap. This was one thing mentioned to me long ago on a similar project, since joining two disconnected sets is essentially interpolating data that we do not have.

2-3-15

This is how the timeseries database should react to devices:

device.<serial> (raw)

| time | appliance  | wattage |
|------|------------|---------|
| 1    | "Computer" | 100     |
| 1    | "Toaster"  | 20      |
| 2    | "Computer" | 120     |
| 2    | "Toaster"  | 15      |
| ...  | ...        | ...     |

Old way: 'select * from device.<serial> WHERE appliance = appliance_name'



Fanout (per column)

device.<serial>.Computer            device.<serial>.Toaster

| time | wattage |                  | time | wattage |
|------|---------|                  |------|---------|
| 1    | 100     |                  | 1    | 20      |
| 2    | 120     |                  | 2    | 15      |
| ...  | ...     |                  | ...  | ...     |

New way: 'select * from device.<serial>.appliance_name'


Fanout (time grouping)

1m.device.<serial>.Computer, 1s.device.<serial>.Computer, ...

1m.device.<serial>.Toaster, 1s.device.<serial>.Toaster, ...

Ideally, we take advantage of what the timeseries database has going for it, which is time grouping. When we do this, we can segregate the data into their own series that relate back to the raw table:

Image

Image Link

2-2-15

Webserver

Generating aggregate queries on raw data per request is very costly. For this reason, we can create Continuous Queries that run in the background. With a continuous query, we can create series that fan the raw data out and precalculate values.

For instance, in this "schema", we can have a table such as this:

|    time    | appliance  | wattage |
|------------|------------|---------|
| 1422658552 | Computer   |   100   |
| 1422658552 | Toaster    |   20    |
| 1422658552 | Television |   60    |
|    ...     |    ...     |   ...   |

In this schema, querying select * will return a nonsensical dataset:

{
"points": [[1422658552, "Computer", 100], [1422658552, "Toaster", 20], [1422658552, "Television", 60]],
"columns": ["time", "appliance", "wattage"]
}

This queryset can be reworked into something the charting library can accept; however, we don't want to rework the dataset every time the query is run. Therefore, it makes sense to fan out the appliances into their own series, so that each fanned-out series has just two columns, ["time","wattage"], with the appliance in the series name device.<serial>.Computer.

This can be done with defining a continuous query:

select * from device.<serial> into device.<serial>.[appliance]

Giving us something that resembles this:

Image

Running select * from series on any of these will return the timestamp/wattage pairs for the queried appliance only.

In addition, a continuous query can be defined that performs time grouping (data smoothing) on the raw data:

select mean(wattage) from /^device.<serial>.*/ group by time (1m) into 1m.:series_name

where :series_name is a value interpolated by InfluxDB. This will fan out the series even further by performing time grouping on each appliance of each device.

When a new device is made, these two continuous queries need to be defined:

# device.serial = x

db.query('select * from device.x into device.x.[appliance]')
db.query('select mean(wattage) from /^device.x.*/ group by time(1m) into 1m.:series_name')
db.query('select mean(wattage) from /^device.x.*/ group by time(5m) into 5m.:series_name')
...

Defining these queries first makes sure there is no backlog. Continuous queries in InfluxDB do not process historical data, so they need to be defined before any data exists in the series they query.

1-31-15

Webserver

Zoomcharts is very API-friendly in that I can specify multiple data sources and different endpoints for different resolutions.

The Data URL Example proved very useful in setting up queries to generate data of a given resolution:

chart = new TimeChart({
    container: document.getElementById("line-chart"),
    data:
    {
        units:["y","M","d","h","m","s"],
        timestampInSeconds: true,
        urlByUnit:{
            "y":"/charts/device/{{ my_devices.0.serial }}/",
            "M":"/charts/device/{{ my_devices.0.serial }}/",
            "d":"/charts/device/{{ my_devices.0.serial }}/",
            "h":"/charts/device/{{ my_devices.0.serial }}/",
            "m":"/charts/device/{{ my_devices.0.serial }}/",
            "s":"/charts/device/{{ my_devices.0.serial }}/",
        }
    }
});

By default, the graph is loaded with the following parameters:

area: {
    initialDisplayUnit: "5 m",
    initialDisplayPeriod: "1 d",
    initialDisplayAnchor: "newestData"
}

With this configuration, the chart will initialize with a resolution of 5 minutes for the last 24 hours. This results in a call to the API: /charts/device/<serial>?unit=m where m can be any member of the set ["y","M","d","h","m","s"], depending on the initial resolution.

When this call is made, the chart will work with the given data regardless of the resolution the user chooses unless it is of a higher resolution than currently downloaded. If the user requests a higher resolution, the new set of data is downloaded and applied to the graph.

A couple complications arise in the framework for the chart:

  • Appliances (series) are generated one-off when the chart is created. It is done by retrieving the first result of the query: select * from "<serial>" limit 1. This will select the first (newest) timestamp:[values] and retrieve the columns of this result. Therefore, if the columns of this result do not reflect all the columns in the current dataset, the chart will throw an error. For now, it's fine, since I have full control over the data and data coming from the device is limited to the "Unknown" appliance. But in the future, I will need to address this.

Another concern is the fact that the series labels do not always match the data they represent. This has to do with how the series are initially determined, discussed in the previous paragraph.

Data grouping

On the server side, data grouping is done entirely through queries to the database:

select mean(column) from series group by time(10m);

And has been documented here.

Another feature of the database is its native ability to grab date ranges from a selection of data. If I wanted the mean values of the data over 10 minute intervals for the last 24 hours, I could say:

select mean(column) from series group by time(10m) where time > now() - 1d;

Where now() is a relative function relating to the exact time when the query is executed.

Since selecting the mean involves singling out an individual column, getting an aggregate of all the columns for Zoomcharts requires calling mean for all columns in the set and aggregating them into one set. This is a very time-intensive operation, taking around 4 seconds for a day's worth of data.
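A sketch of that aggregation as it might look on the server (the shape of the grouped result, [time, mean], is an assumption):

# Fetch 10-minute means per appliance, then zip the per-appliance
# series together into one aggregate dataset for the chart
series = {}
for appliance in appliances:
    result = db.query('select mean(%s) from "%s" group by time(10m) '
                      'where time > now() - 1d;' % (appliance, device.serial))[0]
    for timestamp, mean in result['points']:
        series.setdefault(timestamp, []).append(mean)
values = [[t] + means for t, means in sorted(series.items())]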

Judging by performance analysis, it looks like most of the time is spent in calculating the query. I'll have to look into shards and continuous queries so that tables can be automatically generated in the database downtime that will contain aggregate calculations of the data for the given resolutions. That way, the data is ready to go and does not have to be smoothed upon every request.

Also in the backend, I made changes to the Event API to allow us to store the data points in the new database. I did this by overriding the save() method as such:

from influxdb import client as influxdb

class Event(models.Model):
    ...

    def save(self, *args, **kwargs):
        db = influxdb.InfluxDBClient('localhost', 8086, 'root', 'root', 'seads')
        data = []
        query = {}
        query['points'] = [[self.timestamp, self.wattage]]
        query['name'] = str(self.device.serial)
        query['columns'] = ['time', self.appliance.name]
        data.append(query)
        db.write_points(data)
        #super(Event, self).save(*args, **kwargs)

The call to save() is commented out, since we don't want to store the actual event in Django's database.

1-30-15

Webserver

In reference to this post about performance issues we are facing with data retrieval, I've decided to overhaul the database system for storing data points. After researching the type of data the database will hold, I settled on InfluxDB, an open source, time-series-centered database backend with Python integration.

The big advantage of this database is the graphical admin interface (seads.brabsmit.com:8083) that allows for queries to be tested on-the-fly with existing data in the database. With its SQL-like language, the transition to this system should be nearly seamless. In addition, the database is schemaless, meaning columns can be added to existing series without any penalty. This will come in handy when we start developing the algorithms that detect appliances that are on.

For now, the API endpoints have been disabled while this maintenance persists. I don't expect this process to take longer than the weekend, because of how similar the two systems are.

Preliminary benchmarks with random data generation put this backend at 2 million data points per second per query. This includes the time it takes the server to generate the random data in addition to the time it takes the database to store the points. I suspect storing pre-calculated values to be even faster.

On the backend, the InfluxDB database can be interfaced with as follows:

from influxdb import client as influxdb

db = influxdb.InfluxDBClient(host, port, username, password, database)

A typical query to store data points from a device looks like this:

data = [
{
 "points": [[1422658551, 50, 400, 20], [1422658552, 45, 410, 19], ...],
 "name": device.serial,
 "columns": ["time","Computer","Refrigerator","Toaster"]
}
]

db.write_points(data)

A custom query can be executed from Python as well:

result = db.query('select * from "'+device.serial+'";')[0]

Which will return the following:

[{u'points': [[1422658550, 881620001, 148.43346192136212, 380.76091986738896, 10.762135414807421],
              [1422658549, 881630001, 78.76039036524624, 416.9067253709242, 3.902769935171527],
              [1422658548, 881640001, 79.01911086519291, 284.5366280014289, 4.934237795434257]],
  u'name': u'1',
  u'columns': [u'time', u'sequence_number', u'Computer', u'Refrigerator', u'Toaster']
}]

To get this into the form that ZoomCharts expects, a few transformations need to be made on the resulting query. It would be beneficial to customize the query response instead, but I'm not sure that is possible.

for point in result['points']:
    point.pop(1)                      # drop sequence_number
    point[0] = int(point[0])          # cast timestamp to int
    point.insert(1, sum(point[1:]))   # prepend aggregate wattage across appliances
appliances = result['columns'][2:]
data = result['points']

# chart bounds (points are returned newest-first):
'dataLimitFrom': result['points'][len(result['points'])-1][0],
'dataLimitTo': result['points'][0][0]

This same query can be run on the graphical admin interface which will automatically produce charts for each appliance:

Image

The "schema" for this database is going to be set up as follows:

db name: seads

series: device1.serial

| time       | appliance1 | appliance2 | ... |
|------------|------------|------------|-----|
| 1422658550 | 148        | 380        | ... |
| 1422658549 | 78         | 416        | ... |
| ...        | ...        | ...        | ... |

series: device2.serial

| time       | appliance1 | appliance2 | ... |
|------------|------------|------------|-----|
| 1422651230 | 128        | 302        | ... |
| 1422651229 | 92         | 317        | ... |
| ...        | ...        | ...        | ... |

...

More information on InfluxDB can be found in their documentation.


I've set up the following facilities for the new database:

  • Debug datagen (influxgen)
  • Debug datadel (influxdel)
  • Charts now get their data from InfluxDB

Things yet to do:

  • Make events post to InfluxDB

In addition, I also changed how the devices are hyperlinked in the REST API. Instead of having an AutoField that generates a primary key, I assign the primary key property to the serial field of the Device model, since it is guaranteed to be unique. That way, devices can reference themselves by their serial.
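In the Device model that amounts to something like this (the field type is assumed):

class Device(models.Model):
    # serial doubles as the primary key, so REST URLs such as
    # /api/device-api/<serial>/ reference the device directly
    serial = models.IntegerField(primary_key=True)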

1-27-15

Webserver

The flowchart for how a device initially connects to the webserver and gets registered by a user is below:

Image

I've also altered the user experience flow for the website. The following permissions on webpages exist:

| Request   | Unauthenticated | Authenticated |
|-----------|-----------------|---------------|
| /         | /               | /data/        |
| /data/    | /signin/        | /data/        |
| /signin/  | /signin/        | /data/        |
| /signout/ | /signin/        | x             |
| /account/ | /signin/        | /account/     |

I've locked out most of the website for Unauthenticated users. This is because we only want users who have registered to have access to the data.

The Dashboard is now more ajax-friendly, meaning all page loads are done asynchronously through ajax. This makes for a better user experience in the end, since we won't be reloading the navbar and sidebar between requests.

The chart needs to handle unknown data better. Unknown data is wattage values coming from an uncategorized appliance. It would be nice if there were a way to always have unknown data be the bottom series on the chart, with everything else building up on it. I'll look into this tomorrow.

In addition, the chart itself should be loaded more intelligently. As of right now, two calls need to be made to generate a chart: one for the chart structure (new TimeChart()) and one for the data. These should be combined into a single call that returns an html page with the data preloaded.

1-26-15

Webserver

Device Pairing Key

Frontend

I added a form modal on the sidebar of the dashboard for entry of a secret key to pair a new device to an authenticated user. There are a couple features that make this form special:

  • Form is generated by Django, served by ajax at /key/
  • Form uses jquery-autotab for enhanced user experience:
$(function () {
   $.autotab({ tabOnSelect: true });
   $('.digit').autotab('filter', 'number');
   $('.char').autotab('filter', 'text');
});
  • Once submitted, the form is validated against the first unpaired device with the given secret key, linking it to the user and registering it.
  • Consideration: when the key pairing modal is opened and modified, the modal should reset when it is closed.

Backend

  • A new method was added to the Device model overwriting the default save() behavior. I did this so that the secret_key field will be automatically generated and saved when the model instance is created:
import random
import string

def save(self, **kwargs):
    if not self.secret_key:  # only generate a key on first save
        secret_key  = ''.join(random.choice(string.digits) for i in range(3))
        secret_key += ''.join(random.choice(string.ascii_uppercase) for i in range(4))
        self.secret_key = secret_key
    super(Device, self).save(**kwargs)
  • When the device is created, the server echoes the device properties, including the secret_key field. This can be used by the device to show the user what to input to pair with the server.

  • The server validates the secret key by querying for all unregistered devices matched with the given secret key. If there is a match, the following attributes are updated on the device instance:

device.owner = request.user
device.registered = True
  • The device query is formed as such:
devices = Device.objects.filter(registered=False, secret_key = secret_key.upper())
device = devices[0] if devices else None

1-23-15

ESP8266

A couple more configuration settings come to mind when dealing with the AP mode of the ESP. We need to have the user input identifying data for the device, such as the device's location and a user-defined name (for reference), and we need to come up with some way to link the device to its owner on the website.

Pairing

On the ESP side, we need to have the device generate a random code that is relatively easy for the new user to remember. This code can be generated when the device connects to the internet and relayed to the server for linking. The user should be directed to the website (seads.brabsmit.com) for now to pair the device.

Webserver

Pairing

We need to modify the API for the creation of a device to allow the device to self-register when it connects to the internet for the first time. It should populate the following fields when it does so:

  • Location (user-defined and/or automatic?)
  • User-defined name
  • Pairing code

The pairing code is what the new user will enter when adding a new device to their account. The User API should be modified to show which devices the user owns.

There also needs to be a way to lock down the Device API so that only actual devices can access it. I'll research methods that can be used, but most likely we will have an API token for all devices.

I added Pairing functionality to the Webserver today. I did so by adding a link to a modal in the sidebar that opens a dialog with the option to enter a "secret key" into a form. If the secret key matches an unpaired device, it will pair the device with the user that submitted the form. This was created with the intent that the device will automatically register itself in the database when it is first connected to the internet, waiting to be paired to a user on the system.

The workflow for pairing a device:

  1. User starts device for the first time
  2. User connects to device's setup AP
  3. User enters the following details:
     • location(?)
     • user-defined name
     • privacy setting(?)
     • Internet AP
     • Serial (hardcoded into device)
  4. Once this information is entered, the device should register with the server using a secret authentication key
  5. Upon successful registration, the server should respond with a secret pair code for the user
  6. The device displays the secret pair code on its AP webpage, confirming registration
  7. User navigates to the website, performs login/registration on the Webserver, and opens the "Add Device" interface
  8. User is prompted for the secret pair code
  9. If the secret pair code matches, the device is paired to the user by assigning device.owner = request.user
  10. User refreshes the dashboard to see data

1-22-15

Webserver

Side-loading events for development

I worked out two interfaces on the backend that allow event data to be manipulated while bypassing the REST API altogether. As of right now, there are two new interfaces: datagen and datadel. Each is a webpage that generates a form for quickly loading/removing random data (pseudodata) into the database. These are necessary since, as of right now, there is no real data transmission in the project. So when I'm developing charts and data structures, I need to get a sense of what the data looks like and how the server performs.

The development pages, datagen and datadel are the two functional interfaces.

With this method, adding around 8000 data points takes a matter of seconds, rather than the 8000+ seconds it would take going through the API. The speed limitation on the API side is not a drawback, since our devices are designed to send only sparse data.

Unoptimized Performance metrics

Random wattage data with 1 second resolution over 30 minute period:

  • 1769 data points
  • 96.7kB JSON
  • 1.64s JSON assemble/download time (too slow!)
  • 54.66 Bytes per data point
  • 97 microseconds per data point

Ideally, we want to be able to show a full day's worth of data in 5-minute intervals, so roughly 300 data points. We can use the 300 data points as a maximum, so that any way the user zooms, 300 points remain on the graph, and that is the maximum number of points ever downloaded at once.

This line of reasoning only works if we are looking at one appliance. The more appliances, the more data regardless of resolution or period size.

Data points vs. calculation time & packet size:

Single Appliance Run - No optimization 1 second resolution

| Data points | Size (KB) | Time (s) |
|-------------|-----------|----------|
| 1           | 0.392     | 0.049    |
| 10          | 0.898     | 0.05     |
| 100         | 5.8       | 0.095    |
| 500         | 27.5      | 0.487    |
| 1000        | 54.8      | 0.896    |
| 5000        | 272       | 4.16     |
| 10000       | 544       | 7.45     |

Data points vs. time:

Image

Data points vs. packet size:

Image

The data generation/download is generally linear, which can result in very large wait times.

1-21-15

Webserver

The JSON for Zoomcharts for the following dataset should look like this:

Event data:

timestamp        device      appliance     wattage                                          
...
1421526900000  seads-dev1      Computer      92
1421526900000  seads-dev1       Toaster      21
1421526900000  seads-dev1  Refrigerator     393
1421526900000  seads-dev1    Television      59
...

JSON translation:

{
    appliances   : ['Computer','Toaster','Refrigerator','Television'],
    dataLimitFrom: 1421526900000,
    dataLimitTo  : 1421526900000,
    unit         : "y",
    values       : [
        ...
        [1421526900000, 566, 92, 21, 393, 59],
        ...
    ]
}

Working with Pandas could turn out to be a major asset to the visualization engine if I can get it to work how I intend. In microdata.models, the django-pandas DataFrameManager was added as the model manager to get access to the built-in Pandas methods:

from django_pandas.managers import DataFrameManager

class Event(models.Model):
    device = models.ForeignKey(Device, related_name='events')
    event_code = models.IntegerField(blank=True, null=True)
    appliance = models.ForeignKey(Appliance, related_name='appliance', blank=True, null=True)
    timestamp = models.PositiveIntegerField(help_text='13 digits, millisecond resolution')
    wattage = models.FloatField()
    
    objects = DataFrameManager()

This allows access to the following methods:

  • to_dataframe
  • to_timeseries
  • to_pivot_table

These methods can be leveraged from a view controller as such:

    from microdata.models import Event, Device

    # select device with serial=1 (seads-dev1)
    device = Device.objects.get(serial=1)

    # get all events with filter device=seads-dev1
    events_qs = Event.objects.filter(device=device)

    # convert the queryset to a timeseries
    events_ts = events_qs.to_timeseries(index='timestamp', fieldnames=['timestamp','appliance','wattage'])

    events_ts.head()

                  appliance     wattage
    timestamp                          
    1421526482000  Computer  100.000000
    1421526483000  Computer  104.235621
    1421526483000  Computer   95.178783
    1421526801000  Computer  108.666505
    1421526802000  Computer  100.643132

The next step is to get it into the form as required by ZoomCharts.
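
A first sketch of that transformation, using plain pandas on top of to_dataframe; the pivot and payload assembly are assumptions about how the ZoomCharts shape above could be produced, not the final implementation:

import json

from microdata.models import Event

# Reuse the device selected above; pull its raw events into a DataFrame
df = Event.objects.filter(device=device).to_dataframe(
    fieldnames=['timestamp', 'appliance', 'wattage'])

# One column per appliance, summing readings that share a timestamp
pivot = df.pivot_table(index='timestamp', columns='appliance',
                       values='wattage', aggfunc='sum').fillna(0)

payload = {
    'appliances': list(pivot.columns),
    'dataLimitFrom': int(pivot.index.min()),
    'dataLimitTo': int(pivot.index.max()),
    'unit': 'y',
    # Each row: [timestamp, total, <one value per appliance>]
    'values': [[int(ts), float(row.sum())] + [float(v) for v in row]
               for ts, row in pivot.iterrows()],
}
json.dumps(payload)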

1-20-15

ESP8266

Frontend

I made the ajax request for the wifi cgi script more robust, since it is prone to failure. In the future, I will look at making the script itself more failure-resistant, but for now this can be handled by the frontend jQuery. Two callbacks on the GET request perform different actions depending on its outcome:

function scan() {
    $.ajax({
        type: "GET",
        url: "/wifi/wifiscan.cgi"
    })
    .done(function(data) {
        // data is good to use
        ...
    })
    .fail(function() {
        // data is no good, try again after a delay
        setTimeout(scan, delay);
    });
}

In this way, we ignore any errors from the GET request and simply try again, repeating until a request succeeds.

A useful feature of the jQuery library is the .append() method, which takes an object on the page and appends another object to it. It gets interesting when these calls are nested, as in the wifi AP table:

$("#ap-table").append($('<tr>')
    .append($('<td>')
        .append($('<div>')
            .append('<input>')
	    .append('<label>')
        )
    )
);

This code will produce the following html element:

<table id="ap-table">
    <tr>
        <td>
            <div>
                <input></input>
                <label></label>
            </div>
        </td>
    </tr>
</table>

This is very useful when generating dynamic tables with complex rows.

I made new "flat" monochrome icons to match the bootstrap theme. These icons are for the five different strengths of WiFi we look for as well as a lock icon signifying the network is password protected.

Backend

Since most of the data transfers cannot be relied upon to complete on the first request, they should either be loaded asynchronously (using $.get()) or embedded into the webpage. For loading the image icons for the wifi table, I chose to use base64 encoding in the css page. That way, if the css page transfers successfully, then so will the image embedded in the file.
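
Generating those embedded images is a one-liner; a quick sketch of how an icon could be encoded for the stylesheet (the file name is illustrative):

import base64

# Read an icon and emit a CSS-ready base64 data URI
with open('wifi-strength-3.png', 'rb') as f:
    encoded = base64.b64encode(f.read()).decode('ascii')

print('background-image: url(data:image/png;base64,%s);' % encoded)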

Interestingly, making this change and flashing it to the ESP had no effect: I can delete or modify the icons file, but the ESP still serves the old version. I will look into rebuilding the firmware to see if that helps.

Webserver

It's time to start thinking about how I want to store the data coming from the devices. What I know so far is packets will come in the following format:

{
  "device": "field",
  "event_code": 0,
  "appliance": "field",
  "timestamp": 0,
  "wattage": "float"
}

Where event_code and appliance are optional fields. This packet is expected at roughly 1 second intervals, meaning there is very little data coming in per device. However, as this data stacks up across devices, storage should scale such that the number of devices has no effect on the query time for a single device.
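
Since the API already runs on the Django REST framework, one natural mapping for this packet is a hyperlinked serializer. This is a sketch matching the /api/device-api/N/ URI style used elsewhere in these notes; the exact class is an assumption:

from rest_framework import serializers

from microdata.models import Event

class EventSerializer(serializers.HyperlinkedModelSerializer):
    class Meta:
        model = Event
        # device and appliance arrive as hyperlinked URIs, e.g. /api/device-api/2/
        fields = ('device', 'event_code', 'appliance', 'timestamp', 'wattage')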

Pandas takes full advantage of NumPy, scikits, etc. and packages them together. I'll start researching how this could be deployed on our configuration, and whether it's even worth it over the stock sqlite3.

1-17-15

Webserver

The python script located at ShR2/Web Stack/pseudodata.py was meant to be a data generator for the visualization frontend, but ended up being something else entirely. The script uses the API to send event packets to the server in much the same way a device would. That makes it great for testing the server's API, but far too slow for filling the database with data for general use. Each packet takes about one second to make the round trip from request to response, so asking for thousands of data points adds up to minutes or even hours of processing time.

Therefore, I added a way to create data points locally on the server, avoiding the round trip through the API. I developed a custom Django form that serves as a debug interface for adding data to the repository. In the future, this feature will be disabled in production.

Image

The form was generated by using Django's form class:

from django import forms

class DatagenForm(forms.Form):
    device = DeviceModelChoiceField(label='Device', queryset=Device.objects.all())
    appliances = forms.ModelMultipleChoiceField(label='Appliances', queryset=Appliance.objects.all())
    start = forms.IntegerField(label='Start')
    stop = forms.IntegerField(label='Stop')

Where DeviceModelChoiceField extends ModelChoiceField. When the user submits this form, the value of the device field becomes an instance of the object selected from the database. So if seads-dev1 is selected, we get all the model fields of Device for seads-dev1.

In addition, ModelMultipleChoiceField returns an array of instances of the Appliance class.

These two form fields could come in handy later on when developing customization on the dashboard.

Django forms deliver the data from a POST request already validated and cleaned for the server to handle:

if form.is_valid():
    device = form.cleaned_data['device']

cleaned_data is the dictionary that holds the validated values; for a ModelChoiceField, the value is the actual model instance.
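
Putting it together, a minimal sketch of what the datagen view might look like (the view name, template name, and generate_events helper are assumptions):

from django.shortcuts import render

def datagen(request):
    form = DatagenForm(request.POST or None)
    if request.method == 'POST' and form.is_valid():
        device = form.cleaned_data['device']          # a Device instance
        appliances = form.cleaned_data['appliances']  # Appliance instances
        start = form.cleaned_data['start']
        stop = form.cleaned_data['stop']
        generate_events(device, appliances, start, stop)  # hypothetical helper
    return render(request, 'datagen.html', {'form': form})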

1-16-15

Webserver

I spent the totality of the day creating a script that will generate random data for the dashboard chart visualization. This is the basis for the chart design in the upcoming months. If I can get some filler data for now, then that is all I need to get a skeletal visualization setup down. In the future, it will be beneficial to be working with real data so the charts can start to be designed to have greater meaning.

In addition, I switched over the default graph type to a stacked line area since that is the feedback I got during our last meeting. It is unclear as of yet if this type will be the default, but the only way to know is to play around.

pseudodata.py

The idea of this script is to produce data for the server via the API which, albeit very slow, makes for a good regression test that should be run periodically. As I write this, I realize that this should not be the primary method for populating the database with random data. An easier way would be to hook into the database locally and insert points directly, instead of going over HTTP. However, the script I wrote today will serve as a good stress test to make sure everything is still working.

This script uses parallel processing via Parallel Python to instantiate a few workers that divide up the workload. The script takes as command line arguments a device reference, start time, and end time. When the script loads, it prompts for some appliances, each referenced by an API hyperlink. Another prompt asks for an "average power", which seeds the random generator; each saved point gets the average power plus or minus up to 10.
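
Stripped of the Parallel Python plumbing, the core loop looks roughly like this (a single-process sketch; the token is a placeholder and the URL is the development server):

import random
import requests

API = 'http://seads.brabsmit.com/api/event-api/'
HEADERS = {'Authorization': 'Token <api-token>'}

def send_points(device_uri, appliance_uri, start_ms, stop_ms, avg_power):
    """POST one event packet per second of simulated time through the API."""
    for ts in range(start_ms, stop_ms, 1000):
        packet = {
            'device': device_uri,      # e.g. '/api/device-api/2/'
            'appliance': appliance_uri,
            'timestamp': ts,
            'wattage': avg_power + random.uniform(-10, 10),
        }
        # Each round trip takes about a second, hence the slowness
        requests.post(API, json=packet, headers=HEADERS)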

ESP8266

Henry and I were able to successfully submit a meaningful request between the ESP module and our web server. This exercise was merely a proof of concept, however, since Henry was issuing the commands manually via the terminal. To get the packets sent successfully, we had to run tcpdump on the server and analyze the packet being sent. When we first tried, we were getting status 400 (Bad Request) from the server, so we compared a packet received by the server via the Unix curl command with what the ESP would send:

Packet returns 400 (Bad Request):

22:56:42.373790 IP (tos 0x0, ttl 242, id 51, offset 0, flags [none], proto TCP (6), length 324)
128.114.59.92.39764 > 172.31.19.73.80: Flags [P.], cksum 0x3496 (correct), seq 46246:46530, ack 1113382465, win 5840, length 304
E..D.3....LJ.r;\...I.T.P....B\.AP...4...POST seads.brabsmit.com/api/event-api/ HTTP/1.1
User-Agent: ESP8266
Host: seads.brabsmit.com
Accept: */*
Content-Type: application/json
Authorization: Token 0d1e0f4b56e4772fdb440abf66da8e2c1df799c0
Content-Length: 74

{"device": "/api/device-api/2/", "wattage":"10", "timestamp":"1421296856"}

The problem with the above packet was narrowed down to the POST address: seads.brabsmit.com/api/event-api/. From what I can gather, the server treated it as a relative URI, so it resolved to http://seads.brabsmit.com/seads.brabsmit.com/api/event-api/. To fix this, we need to either a) mark the URI as absolute by adding http:// before the address, or b) correctly reference the relative path: /api/event-api/. We went with option b) since the DNS of our server will change after development.

Packet returns 201 (Created):

22:56:42.373790 IP (tos 0x0, ttl 242, id 51, offset 0, flags [none], proto TCP (6), length 324)
128.114.59.92.39764 > 172.31.19.73.80: Flags [P.], cksum 0x3496 (correct), seq 46246:46530, ack 1113382465, win 5840, length 284
E..D.3....LJ.r;\...I.T.P....B\.AP...4...POST /api/event-api/ HTTP/1.1
User-Agent: ESP8266
Host: seads.brabsmit.com
Accept: */*
Content-Type: application/json
Authorization: Token 0d1e0f4b56e4772fdb440abf66da8e2c1df799c0
Content-Length: 74

{"device": "/api/device-api/2/", "wattage":"10", "timestamp":"1421296856"}

1-15-15

Webserver

To help aid Henry's venture to get the ESP connected to the web server, I set up a simple API endpoint that simply returns status 201 (Created) when invoked. I didn't restrict the request method, so a connection by any means (POST, GET, DELETE, etc.) will echo a 201.

The endpoint is set up at /echo/.
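
The view behind it can be as small as this sketch (the function name is illustrative; CSRF exemption is needed since the ESP posts from outside the domain):

from django.http import HttpResponse
from django.views.decorators.csrf import csrf_exempt

@csrf_exempt
def echo(request):
    # Answer 201 regardless of request method so the ESP can verify connectivity
    return HttpResponse(status=201)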

1-14-15

ESP8266

Frontend

I measured the speed of a content download from the server this morning, topping out at 40kB/s. This means any element larger than 100kB will take about 2.5 seconds to load. With this in mind, I began streamlining our libraries for the ESP module down to the bare essentials. Bootstrap and jQuery could stay, but I recompiled them with just the modules I was using.

For the jQuery build, I used This Website and built a version with only ajax.

For the Bootstrap build, I used This Website to customize Bootstrap. The build consisted of the following items:

Image

This reduced both libraries by about 30%, saving roughly a second of page load time, which is a lot at these transfer speeds.

In addition to streamlining the libraries, I'm loading them through asynchronous ajax requests. This removes the blocking the download would have if it were included as a standard js plugin:

<!-- Blocking download -->
<script src="/static/js/bootstrap.min.js"></script>

<!-- Non-blocking download (jQuery itself must already be loaded) -->
<script>
    $.getScript("/static/js/bootstrap.min.js");
</script>

The downside to this strategy is that the functions contained in the libraries are unavailable until the library is loaded, so the scrolling features and navbar links are non-operational until they download. I thought this was a reasonable tradeoff to get the page to load a little quicker and get some quick content up on the page for the user.

Later on, I removed all dependencies on the 140medley.js library, such as the xhr() function, and switched over to jQuery for all ajax requests, since I found xhr() to not be 100% reliable. The way to execute cgi scripts is as follows:

$.get(url, function ( data ) {
    do_something(data);
});

Backend

On the "backend" of our ESP module, there is a ton of C code that runs which generates the cgi scripts and templates that are served when a page is loaded. One such script, cgiwifi, is the routine that actually queries the device for networks in its area. This is then served back to the webpage in the form of a json that gets interpreted by the javascript on the page.

Since I've moved the functionality of the wifi scan from its own page onto the front page, I needed to move the cgi script to the front page:

// user_main.c

HttpdBuiltInUrl builtInUrls[]={
    ...
    {"/index.tpl", cgiEspFsTemplate, tplSetupPage},
    ...
};

The builtInUrls array is a url->function map that translates web requests into function calls. The first field is the url, the second is the cgi handler to run (here cgiEspFsTemplate, which serves the template), and the third is its argument (here tplSetupPage, the callback that supplies context for the template's tokens). For this change, the routines carried out in cgiwifi.c had to be moved to cgi.c, where the tplSetupPage function now lives.

For future development, tplSetupPage is a great springboard for other functions and pages that require system calls and context.

1-13-15

ESP8266

The FTDI device that is the UART-USB bridge I am using to connect is still operational. I know this because I can connect to it via my machine and read its output. This means the problem lies somewhere in the device handoff for the virtual machine. Which is odd, because for the longest time, the device was able to connect seamlessly. I've reinstalled a fresh lubuntu distribution and am running Henry's toolchain builder on it to see if I can get anything from a fresh install.

What I'd like to do when I can compile to the ESP:

  • Fix the signal strength glyph icons to show actual strength
  • Slim down the interface to exactly what we want
  • Add a custom cgi script to play with the functionality
  • Fill in the gaps on the website: add a tutorial

I'd also like to talk with Henry about getting the ESP to start sending data once it's connected to an access point. I would feel much better if we actually had the ESP talking to the web server, since that is the ultimate goal. If we can get a primitive form of that, then everything else will come naturally.

Problem solved! I'm not exactly sure what caused VirtualBox to pick up the USB again, but it's working. After reinstalling the toolchain and testing the make, it connected to the FTDI and began flashing out of the blue. Perhaps it was working all along on the new virtual machine; I had no way of knowing because screen was not installed, which I only discovered when I tried to read from the USB.

I started playing around with how we can speed up the responsiveness of the webpages served by the ESP. We should be able to use AJAX through jQuery to load new html content on one page. This takes out the awkwardness of reloading scripts and css, like the big Bootstrap and jQuery files, every time a new page loads. That way, our homepage acts as a "template" with a container that changes content based on user interaction:

    <script>
    function wifi() {
        $.ajax({
            url: "/wifi",
            success: function(result) {
                $("#container").html(result);
            }
        });
    }
    </script>

This will load the wifi page we use to scan and connect to a router. In the future, this function can be slimmed down to a $.load and it could be triggered by a button's click event.

So the Ajax request goes through and it loads the new content, but for some reason the javascript that should accompany it is not functioning. I will look into this after class.

After more deliberation, I decided to nix the ajax loading for now. The new concept is to have a single page application that takes the user through the set up process. There's a bootstrap theme that I applied on the ESP this evening that simulates pagination through scrolling. If each "page" is looked at like a step in the setup process, then everything would flow nicely.

If this page gets too big and takes a while to load, I may have the page generate piecewise so there is at least something displayed while the rest is being loaded. The sections that haven't come in yet can be filled with loading bars or something.

Webserver

I took care of some administrative items this morning with the webserver. In the footer, I added logos for UCSC, Baskin, and CITRIS. Two of these were not transparent so I added transparency. I also expanded the size of the graph in preparation for its transformation from bar graph to stacked line graph. Not too exciting.

1-12-15

ESP8266

I was able to add the Bootstrap css library to the esp webserver. After playing around with the flash memory, it seems there is somewhere between 300kB and 450kB to work with. This means we shouldn't put any pictures on the esp, since we are already broaching 300kB with Bootstrap and jQuery alone.

Due to the timing of the esp (aka how slow it is), javascript should be loaded in the head as opposed to the bottom of the document. This way we are sure it is loaded before any markup or script that depends on it runs.

Preliminary measurements put the esp's current draw between 50 and 100 mA. Power consumption is not a concern for us, but I thought it would be interesting to know.

Since we like pictures:

Obligatory "Hello world" from the ESP Image

Functional interface to scan WiFi networks and configure the ESP to connect to one Image

For tomorrow: Henry is bringing in new (FCC certified) ESP8266-03 chips that we will start to develop primarily for. This will change the game slightly, but we should be able to use our already existing source and cross compile fairly easily.

Webserver

I tried grappling with more automation features for our webserver. I am now at the point where GitHub sends a POST request to an API endpoint /gitupdate/ on the server, which runs a small script that essentially calls git pull. The end result is close to full automation, but a problem remains: some of the time, server code changes require a uWSGI restart before they take effect.

This is different than just calling git pull. That command works so easily because I set up permissions on the web root for www-data, our nginx user for the web server. To restart uWSGI, the following command must be run:

sudo service uwsgi restart

The problem is that this command requires sudo, which immediately raises a couple of concerns. Do I want a command that requires superuser privileges to run from an API endpoint? Is it even possible to automate a sudo command? I'll try to answer these questions tomorrow.
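
For the second question, one common approach worth investigating is a sudoers rule that lets the web user run exactly this one command without a password; a sketch (hypothetical, not yet applied):

# /etc/sudoers.d/uwsgi-restart (hypothetical; always edit with visudo)
www-data ALL=(root) NOPASSWD: /usr/sbin/service uwsgi restart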

1-9-15

Webserver

Zoomcharts

Today the ticket was resolved. Turns out I had downloaded an older version of Zoomcharts somehow when converting to a free license. I was able to download the newest version from their website. After testing it out, it seems to work even better than before.

Github's Webhook Service

After getting the Web Stack code pushed to our repository, I had to come up with a way to deploy it to the webserver automatically. If I could accomplish this, any change to the repository would be deployed to our server automatically, meaning I would not have to log in to the web server to do anything.

The first thing to do was repoint uWSGI at the repository instead of the local code located at /home/ubuntu/seads. In the home directory, I created a new folder, seads-git, and cloned the repository into it. From there, I edited /home/ubuntu/uwsgi.ini and updated the following line:

chdir=/home/ubuntu/seads-git/ShR2/Web Stack

This makes the cloned repository our new web root.

There is extensive documentation on GitHub's website that shows how to send a POST request to an endpoint on a certain event. We can have the repository send a POST to the server whenever someone pushes a change to the code.

Some changes were made to the Django REST framework to create the endpoint we would need. In seads/home/views.py, a routine was added:

import git

from django.http import HttpResponse
from django.views.decorators.csrf import csrf_exempt

@csrf_exempt
def gitupdate(request):
    if request.method == 'POST':
        g = git.cmd.Git("/home/ubuntu/seads-git/ShR2/")
        g.pull()
        return HttpResponse(status=201)
    else:
        return HttpResponse(status=403)

CSRF exemption is necessary since this POST request is coming from outside the domain.

The url endpoint was created by making modifications to the seads/seads/urls.py module:

...
url(r'^gitupdate/$', 'home.views.gitupdate'),
...

NOTE: The webhook url in github needs to CHANGE when we eventually migrate domains. It is set to seads.brabsmit.com for now.

ESP8266

Further testing this morning, I was able to easily recreate the wireless network seen last night. Once the firmware is flashed, the device is ready for the web server content. This is a .espfs file generated by make htmlflash in the esphttpd directory. The command scrapes the html folder in that directory, compiles the content into an image, and uploads it to the device. When it finishes, the network "ESP_98A8D2" will form and can be connected to.

If you connect to this network, visiting http://192.168.4.1/ will bring up a simple webpage with a link to get the ESP connected to a router. I've already configured my ESP to connect to "seads" which is our in-house WAP. From "seads", I can successfully ping the device (500ms response time??).

The html directory can be modified to our heart's content. I'll experiment with adding a few "web" features such as jQuery and Bootstrap to see how much this device can handle. As for right now, it seems pretty functional as is.

If changes are made to the html directory, restarting the device and running make htmlflash again should compile and upload the new content.


1-8-15

ESP8266

I installed the toolchain for the ESP8266 this morning by following the Toolchain Wiki Tutorial. After I did this, Henry pushed a script to the repository that does essentially the whole process automatically. My manual process seems to work the same, minus one change: we need to get the esptool program onto the user's PATH so it can be executed from anywhere:

sudo ln -s /opt/Espressif/esptool-py/esptool.py /usr/local/bin

Henry found a forum post today that looks to be a complex version of a web server that we could adapt to use with our application. It seems to become an access point if no network configuration is set, and a web server otherwise.

A couple changes had to be made in order to get the cloned repository to work properly:

  1. Heatshrink had to be cloned into /opt/Espressif/source-code-examples/esphttpd/lib/heatshrink/
  2. The Makefile was using deprecated esptool syntax, so a few modifications had to be made to this file:

Header changes:

...
#Esptool.py path and port
ESPTOOL     ?= esptool.py
...

Changes to flash routine:

...
flash: $(FW_FILE_1) $(FW_FILE_2)
    -$(ESPTOOL) --port $(ESPPORT) write_flash 0x00000 firmware/0x00000.bin 0x40000 firmware/0x40000.bin
...

Changes to htmlflash routine:

...
htmlflash: webpages.espfs
    if [ $$(stat -c '%s' webpages.espfs) -gt $$(( 0x2E000 )) ]; then echo "webpages.espfs too big!"; false; fi
    $(ESPTOOL) --port $(ESPPORT) write_flash 0x12000 webpages.espfs
...

With these changes made, I can now follow the instructions in the tutorial to get the ESP8266 to run the web server code, which is the next step in the process.

I got the web server code to seemingly work; I'll have to explore it more tomorrow. Here is the process to get the esphttpd code uploaded and configured:

sudo make flash
# reset ESP8266
sudo make htmlflash
# disconnect GPIO0 (floating)

After doing the above, a new wireless network should form, "ESPxxxx"

Connect to this network and go to http://192.168.4.1 to play around with the web server.


1-7-15

Webserver

Zoomcharts

Today I got the Zoomcharts license converted from trial to free, which brought up issues with how the chart now looks. I think it's a bug, so I've filed a ticket with their system. Chart development is halted until this resolves. If it doesn't resolve in a reasonable amount of time, I'll have to decide on a new charting library.

ESP8266

I followed the instructions on This Page to get the ESP connected to a serial-usb adapter. Image

I dove into the Forum to start looking at ways to get this thing to do what we want. I'll look into using the "official" GCC compiler in the image provided by this post and see where that leads. The virtual machine is up and running; I still need to install 32-bit Eclipse.

(1-8-15)- lubuntu credentials (since I keep forgetting):

  • username: esp8266
  • password: espressif

Weekly Reports

2-7-15 to 2-14-15

  • Webserver

    • Developed methods for real-time data on the graphs
    • Added Google Maps API to website
    • Autocomplete places search
    • Added ability to pinpoint user on map and place a marker
    • Started adding functionality for user to pinpoint location of device
    • Added method to clean up events on device delete
  • ESP8266

    • Worked out device pairing workflow with Henry - device determines the code, sends to server
  • Miscellaneous

    • Went over Teddy's thesis (day-long detour)

2-2-15 to 2-6-15

  • Webserver

1-25-15 - 1-31-15

  • Webserver

    • Added device pairing algorithm
    • New device registration refined: added pairing key generation
    • Restricted access to most webpages: user must be authenticated
    • Changed database for storing events:
      • InfluxDB replaces sqlite3 for charting, sqlite3 remains the db for everything else
      • Rewrote sideloading into InfluxDB
      • Started data smoothing by using aggregate queries (should not be done on the fly)
    • Made dashboard load more logical: initial load brings up a new graph with default settings, subsequent loads only load new chart data (json)

1-18-15 to 1-24-15

  • ESP8266

    • Improved ajax requests for the AP webserver
    • Added nesting appends to the WiFi selection form
    • Encoded images with base64 to streamline data transfer
    • Need to add user customization to device AP interface
  • Webserver

    • Started exploring Pandas for Timeseries data manipulation. Will revisit next week.
    • Data download requests are clocking in at linear time, far too slow for mass queries.
    • Added debug modules to side-load data into the database for testing
    • Created a method on the server to pair a device to a user using secret key API
    • Need to work out a way to get the secret key from the server to the device

1-11-15 to 1-17-15

  • ESP8266

    • Loaded Bootstrap and jQuery onto the httpd server, removed 140medley
    • Pages tend to load slowly, so most resource downloads are made through asynchronous ajax requests
    • Changed the layout of the httpd website by having the entire workflow on a single page, with each section loading asynchronously.
    • Was able to modify the cgi scripts in the C code to get custom responses from the server
    • Worked with Henry to get packets sent from the ESP to the web server successfully (documentation)
  • Webserver

    • Decided against having sudo service uwsgi restart command run from the github webhook for now.
    • Added logos to the footer of the website
    • Set up echo endpoint for initial http testing with the ESP
    • Created regression script for the API
    • Created data generator on the server for charting.

1-4-15 to 1-10-15

  • ESP8266 (25 hours)
    • Began working with this WiFi module. I needed to download a lubuntu distribution and install the toolchain to get flashing.
    • Several changes had to be made to the toolchain to get it to work properly. The makefile commands needed to be updated and paths needed to be changed. These are changes to my specific build, Henry reportedly did not need to change his Makefile.
    • The process to flash the memory and upload the web pages is as follows:
      1. Flash firmware make flash
      2. Reset ESP
      3. Flash webpages make htmlflash
      4. If broadcasting, done
      5. If not broadcasting, disconnect GPIO0, wait for device to reset
      6. Flash webpages make htmlflash
      7. If broadcasting, done
      8. If not broadcasting, goto 5
    • The ESP is at a place right now where it can create an access point, scan for access points, and allow the user to specify which access point to connect to. After it's connected, we can ping it from the access point it connected to. However, the ping latency is 500ms, which could be a problem later.
  • Webserver (15 hours)
    • I experienced a bug with Zoomcharts (our charting library) and had to create a support ticket. Chart development was delayed 2 days due to this. The bug was fixed and everything is back on track.
    • GitHub integration - up to this point, the webserver code was modified locally with no version control. I took the time to upload the code to the team's repository and set up a home for it on the server. This involved changing the web's root directory through uwsgi.
    • GitHub's webhook service - I took the time to set up an API endpoint on the server that hooks into GitHub to pull the newest version of the repository whenever it changes. This allows developers to make changes to the web code on their local machines and see them reflected immediately after a push, so no one needs to log in to the webserver anymore. The only caveat is that, at this point, if a change requires a uwsgi reboot, it must be done manually.

Weekly Meeting Notes

1-12-15

  • Our aim is to get a device out in the field within the next couple of weeks. We want to give Pat something to look at, as well as have something out in the wild that we can play around with.

    • We need to merge the AP project with Henry's UART to HTTP project so we have the full stack on the ESP.
    • In addition, I need to polish up the AP setup project so that we can accept a device name and location, as well as a pairing key once it's connected.
  • The stacked line area chart was well received, the only complaint being the red line that sits on top of every series. It was suggested to get rid of it or at least make it a different color.

1-12-15

  • Administrative: weekly reports are due Sunday nights. They should cover what happened individually over the course of the week and what was completed on the Gantt chart (if anything), along with hours worked.
    • The GitHub Wiki is a great resource for documentation, but teaching staff recommends having an engineering notepad for more personal notes and/or calculations on the fly. Not sure how much mileage I would get out of one as I am mainly software.
  • Website: generally positive first glance, but most of it is still just a shell for what's to come. There was a lot of feedback on the charting utility, which is great, since it is still in its fledgling stages and can be changed fairly easily.
    • Charting: move away from the bar chart; the preference is a stacked area line chart. This gives the overall impression of disaggregated data and how each appliance is independent of the others.
    • Perhaps make the chart more customizable by having a couple different default layouts to choose from.
    • We want to be able to take the stacked area and blow it up so that each device can be looked at separately.
    • The idea of having an appliance be plucked out over a range and inspected is potentially a great feature. If I could develop statistical queries against a multitude of devices, this could be a money-saving feature.
  • ESP8266: no feedback was given here, I think we are on the right track and should keep going the way we are.