shepherddog.io

Presented By: Zachary, Julian, Xehu, & Steven

December 6, 2018

woof.

woof.

woof.

What does shepherddog do?

  • Scrapes the existing Babson course catalog and converts it into a RESTful API
  • Allows the user to create a full course schedule with a React.js-powered calendar and smart table
The Power of Python
Average 1 - 3 seconds... p.s. this is live

How does this work?

User Goes To Site

Django Calls Course Fusion

Scrapes & Converts to API

React.js Reads JSON

{
course_section: "01",
class_room: " ",
key: "ACC1000-01-8:00AM-9:35AM-M-38Spring2018 ",
prof_last_name: "Blanchette-Proulx",
prof_full_name: "Shay Blanchette-Proulx",
day_of_week: "MW",
course_code_raw: "ACC1000",
course_code: "ACC1000-01",
spots_taken: 13,
class_name: "Introduction To Financial Accounting",
spots_available: 38,
time: "8:00AM-9:35AM",
spots_filled: "13 of 38",
credits: "4.00",
prof_first_name: "Shay",
session: "Full Session",
semester: "Spring2018"
},

Core Technologies

Django

  • Python Web framework
  • 'Batteries Included'

Facebook's React.js

  • Component-based frontend JS library

Heroku

  • Dev 'friendly' Platform as a service

Ant Design

  • React.js frontend design framework

CORE COURSE PRINCIPLES

JSX

  • Virtual DOM Technology
  • HTML in JS 😲
  • ES6 Syntax

import React from 'react';

export default (props) => {
  return (
    <h1>{props.title}</h1>
  );
};

HTTP Calls & Status Codes

  • HTTP vs HTTPS calls in the context of a web application
  • HTTP Status Codes
    • 1xx Informational responses
    • 2xx Success
    • 3xx Redirection
    • 4xx Client errors
    • 5xx Server errors
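
The five status code classes above can be looked up from the first digit of the code. The sketch below is only an illustration in Python; the STATUS_CLASSES table and the status_class helper are hypothetical names, not part of the shepherddog code.

```python
from http import HTTPStatus

# Hypothetical lookup table: first digit of the code -> class name from the slide
STATUS_CLASSES = {
    1: "Informational response",
    2: "Success",
    3: "Redirection",
    4: "Client error",
    5: "Server error",
}


def status_class(code):
    """Return the class name for an HTTP status code such as 404."""
    return STATUS_CLASSES.get(int(code) // 100, "Unknown")


for code in (HTTPStatus.OK, HTTPStatus.NOT_FOUND, HTTPStatus.INTERNAL_SERVER_ERROR):
    # HTTPStatus members are integers and carry a human-readable phrase
    print(int(code), code.phrase, "->", status_class(code))
```

A client can branch on the class first (retry on 5xx, fix the request on 4xx) before looking at the exact code.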

CORE COURSE PRINCIPLES

JSON Files (Read & Write)

  • Array of objects
  • Key Value Pairing
  • Serialization
  • Reading through JSON
  • Sending files through JSON
{
    "glossary": {
        "title": "example glossary",
        "GlossDiv": {
            "title": "S",
            "GlossList": {
                "GlossEntry": {
                    "ID": "SGML",
                    "SortAs": "SGML",
                    "GlossTerm": "Standard Generalized Markup Language",
                    "Acronym": "SGML",
                    "Abbrev": "ISO 8879:1986",
                    "GlossDef": {
                        "para": "A meta-markup language, used to create markup languages such as DocBook.",
                        "GlossSeeAlso": ["GML", "XML"]
                    },
                    "GlossSee": "markup"
                }
            }
        }
    }
}
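
To make the bullets above concrete, here is a minimal sketch of serializing an array of objects to JSON and reading it back with Python's standard json module. The sample record reuses a few fields from the class object shown earlier in the deck; the variable names are assumptions for illustration. Note that strict JSON requires double-quoted keys, unlike the JavaScript-style object on the earlier slide.

```python
import json

# An array of objects: each object is a set of key/value pairs
classes = [
    {
        "course_code": "ACC1000-01",
        "class_name": "Introduction To Financial Accounting",
        "spots_taken": 13,
        "spots_available": 38,
    },
]

# Serialization: Python list of dicts -> JSON text
text = json.dumps(classes, indent=2)
print(text)

# Reading through JSON: JSON text -> Python objects, then walk the pairs
restored = json.loads(text)
for course in restored:
    for key, value in course.items():
        print(key, "=", value)
```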

Django: endpoints

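from django.http import JsonResponse
from django.shortcuts import render
from django.views.decorators.clickjacking import xframe_options_exempt

https://shepherddog.io/app/classes

# Serves the React app
@xframe_options_exempt
def react_view(request):
    return render(request, './app/react/index.html')

https://shepherddog.io/api/v1/classes

# Returns the scraped classes as JSON; safe=False is required because the payload is a list, not a dict
@xframe_options_exempt
def render_classes_json(request):
    classes = get_classes()
    return JsonResponse(classes, safe=False)

@xframe_options_exempt allows the view to be seen in an iframe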

Django: get_classes()

import lxml.html as LH
import requests
from ShepherdDogUser.models import random_id
def get_classes(search_term=None):
    """
    This function goes into the Babson Course Listing
    Takes out all of the courses and exports them all
    as objects in one giant array
    """
    all_classes = [] # Will contain all of the class objects
    xpath1 = "*"
    #  Get initial unprotected Babson Course Log
    url = 'https://fusionmx.babson.edu/CourseListing/index.cfm?fuseaction=CourseListing.DisplayCourseListing&blnShowHeader=false&program=Undergraduate&semester=All&sort_by=course_number&btnSubmit=Display+Courses'
    page = requests.get(url)
    # Turn page html string into lxml HTML object
    html = LH.fromstring(page.text)
    # Iterate over each element
    data = html.xpath(xpath1)
    # Get the body element object
    body = data[1]
    # This next section is the process of iterating and drilling down into the td elements
    tables = body.xpath('table')
    tr_elements = [] # This will house all of our final <tr /> HTML objects
    for table in tables:
        tr_elements = []
        tr = table.xpath('tr')
        semester = table.xpath('tr/td/*/tr')[0].xpath('*/text()')
        all_trs = tr[0].xpath('td')[0].xpath('table')[0].xpath('tr')
        for tr in all_trs:
            for div in tr.xpath('td')[0].xpath('div'):
                for tr in div.xpath('*')[0].xpath('*'):
                    tr_elements.append(tr)
    
        # Drill further down into each <tr /> element to get to the tds
        for tr in tr_elements:
            class_object = {}
            td_array = []
            for td in tr.xpath('td'):
                count = 0
                course_name = td.xpath("a")
                # Check if it is an <a /> tag or not
                if len(course_name) == 1:
                    course_name = course_name[0].xpath("text()")[0]
                    td_array.append(course_name)
                else:
                    # If it is not just place in the text from the <td />
                    try:
                        course_data = td.xpath("text()")
                        
                        td_array.append(course_data[0])
                    except IndexError:
                        pass
            # The column positions shift depending on whether a meeting time is listed,
            # so assign class object attributes from the tr's td data points by position
            try:
                class_object['class_name'] = td_array[2].title()
                class_object['course_code'] = td_array[1]
                
                # Assign the different array locations to each of the key
                if '-' in td_array[3]:
                    class_object['day_of_week'] = "".join(" ".join(td_array[3].split(' ')[0:2]).split())
                    class_object['time'] = "".join(" ".join(td_array[3].split(' ')[2:]).split())
                    prof_place = 4
                    class_room_place = 5
                    spots_filled_place = 6
                    credits_place = 7
                    session_place = 8
                else:
                    prof_place = 3
                    class_room_place = 4
                    spots_filled_place = 5
                    credits_place = 6
                    session_place = 7

                class_object['class_room'] = td_array[class_room_place]
                class_object['session'] = td_array[session_place]
                class_object['semester'] = "".join(semester[0].split())
                class_object['credits'] = td_array[credits_place]
                class_object['spots_filled'] = td_array[spots_filled_place]
                class_object['prof_last_name'] = "".join(",".join(td_array[prof_place].split(',')[0:1]).split())
                spots_filled = td_array[spots_filled_place]
                class_object['spots_taken'] =  int("".join("of".join(spots_filled.split('of')[0:1]).split()))
                class_object['spots_available'] =  int("".join("of".join(spots_filled.split('of')[1:]).split()))

                class_object['course_code_raw'] =  "".join("-".join(td_array[1] .split('-')[0:1]).split())
                class_object['course_section'] = "".join("-".join(td_array[1] .split('-')[1:]).split())
                class_object['key'] = td_array[1] + "-" + "".join(" ".join(td_array[3].split(' ')[2:]).split()) + "-" + "".join(" ".join(td_array[prof_place].split(' ')[2:]).split()) + "-" + "".join(" ".join(td_array[spots_filled_place].split(' ')[2:]).split()) + "".join(semester[0].split()) + td_array[class_room_place]
                first_name = ",".join(td_array[prof_place].split(',')[1:])
                # Remove Middle Initial
                if " " in first_name:
                    first_name = "".join(" ".join(first_name.split(' ')[0:2]).split())
                class_object['prof_first_name'] = first_name
                class_object['prof_full_name'] = first_name + " " + "".join(",".join(td_array[prof_place].split(',')[0:1]).split())

                # class_object['prof'] = td_array[4]
            except Exception:
                # Rows that do not fit the expected layout stay incomplete and are filtered out below
                pass
            
            try:
                if class_object['class_name'] != 'Title' and class_object['prof_first_name'] != 'CANCEL':
                    all_classes.append(class_object)
            except KeyError:
                pass
    return all_classes

    

def get_class_description(semester, year, course, section):
    """
    This function goes into the Babson Course Listing Detail
    And returns the class description
    """
    xpath1 = "*"
    # Build the URL of the unprotected Babson course detail page
    url = 'https://fusionmx.babson.edu/CourseListing/index.cfm?fuseaction=CourseListing.DisplayCourseDetail&semester='+semester+'%20'+year+'&course_number='+course+'&course_section=' + section
    page = requests.get(url)
    print(url)
    # Turn page html string into lxml HTML object
    html = LH.fromstring(page.text)
    # Get the top-level elements of the page
    data = html.xpath(xpath1)
    print(data)
    # Drill down from the body element to the description text
    description = data[1].xpath('*')[0].xpath('*')[0].xpath('*')[0].xpath('*')[0].xpath('*')[-1].xpath('*')[1].xpath('text()')

    return description
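
The nested join/split chains in get_classes() above are hard to read. As a sketch only, the same string handling can be expressed with two small helpers. The names split_spots and split_professor are hypothetical, and the raw professor format "Last, First M" is an assumption inferred from how the scraper splits on the comma and drops a middle initial.

```python
def split_spots(spots_filled):
    """Split a string like '13 of 38' into (spots_taken, spots_available)."""
    taken, capacity = spots_filled.split("of")
    # int() ignores the surrounding spaces left over from the split
    return int(taken), int(capacity)


def split_professor(raw_name):
    """Split 'Last, First M' into (first, last), dropping any middle initial."""
    last, _, rest = raw_name.partition(",")
    given = rest.split()
    first = given[0] if given else ""
    return first, last.strip()


taken, capacity = split_spots("13 of 38")
first, last = split_professor("Blanchette-Proulx, Shay M")
print(taken, capacity, first, last)
```

Keeping each parsing rule in its own function also makes it possible to catch a malformed row in one place instead of wrapping the whole block in a try/except.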

Django: json object

{
    course_section: "01",
    class_room: " ",
    key: "ACC1000-01-8:00AM-9:35AM-M-38Spring2018 ",
    prof_last_name: "Blanchette-Proulx",
    prof_full_name: "Shay Blanchette-Proulx",
    day_of_week: "MW",
    course_code_raw: "ACC1000",
    course_code: "ACC1000-01",
    spots_taken: 13,
    class_name: "Introduction To Financial Accounting",
    spots_available: 38,
    time: "8:00AM-9:35AM",
    spots_filled: "13 of 38",
    credits: "4.00",
    prof_first_name: "Shay",
    session: "Full Session",
    semester: "Spring2018"
},

Django: the key issue

class_object['key'] = td_array[1] + "-" + "".join(" ".join(td_array[3].split(' ')[2:]).split()) + "-" + "".join(" ".join(td_array[prof_place].split(' ')[2:]).split()) + "-" + "".join(" ".join(td_array[spots_filled_place].split(' ')[2:]).split()) + "".join(semester[0].split()) + td_array[class_room_place]
  • We needed a unique key for every class, which was a challenge
  • We could NOT use a randomly generated number: saved links depend on the key staying the same
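
A minimal sketch of the idea, assuming the key only needs fields that stay the same from one scrape to the next. The build_key helper is hypothetical and simpler than the one-liner above: it leaves out the professor's middle initial, so its output differs slightly from the real key.

```python
def build_key(course):
    """Build a repeatable key from fields that do not change between scrapes."""
    stable = [
        course["course_code"],
        course["time"],
        str(course["spots_available"]),  # capacity, not the current enrollment
    ]
    return "-".join(stable) + course["semester"] + course["class_room"]


course = {
    "course_code": "ACC1000-01",
    "time": "8:00AM-9:35AM",
    "spots_taken": 13,
    "spots_available": 38,
    "semester": "Spring2018",
    "class_room": " ",
}

# The same input always yields the same key, so a saved link keeps working
print(repr(build_key(course)))
```

Because spots_taken is not part of the key, the key does not change as students enroll.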

React: Higher View

React: Create calendar events

{
    course_section: "01",
    class_room: " ",
    key: "ACC1000-01-8:00AM-9:35AM-M-38Spring2018 ",
    prof_last_name: "Blanchette-Proulx",
    prof_full_name: "Shay Blanchette-Proulx",
    day_of_week: "MW",
    course_code_raw: "ACC1000",
    course_code: "ACC1000-01",
    spots_taken: 13,
    class_name: "Introduction To Financial Accounting",
    spots_available: 38,
    time: "8:00AM-9:35AM",
    spots_filled: "13 of 38",
    credits: "4.00",
    prof_first_name: "Shay",
    session: "Full Session",
    semester: "Spring2018"
},

[Array of selected keys]


renderStartTime(time, day_of_week, start) {
  // Map the catalog's day letter to an ISO weekday (Monday = 1); R is Thursday
  var isoDays = { M: 1, T: 2, W: 3, R: 4, F: 5, S: 6 };
  var dayINeed = isoDays[day_of_week] || 0;

  // time looks like "8:00AM-9:35AM"; take the start half or the end half
  var realTime = time.split('-')[start ? 0 : 1].replace(/\s+/g, '');
  var amOrPm = realTime.slice(-2);
  var realHour = parseInt(realTime.split(':')[0], 10);
  var realMinute = parseInt(realTime.split(':')[1], 10);

  // Convert to 24-hour time: 12PM stays 12 and 12AM becomes 0
  if (amOrPm == 'PM' && realHour != 12) {
    realHour = realHour + 12;
  } else if (amOrPm == 'AM' && realHour == 12) {
    realHour = 0;
  }

  // set() is a moment method, so apply it before converting to a Date
  return moment()
    .isoWeekday(dayINeed)
    .set({ hour: realHour, minute: realMinute, second: 10 })
    .toDate();
}

renderCourseEvents() {
  // Adds an addHours helper to every Date (not used inside this method)
  Date.prototype.addHours = function (h) {
    this.setHours(this.getHours() + h);
    return this;
  };

  var courseEvents = [];
  this.state.selectedTableData.forEach((course) => {
    // Skip courses with no meeting days listed
    if (course.day_of_week) {
      // "MW" becomes ["M", "W"]: one calendar event per meeting day
      var days = course.day_of_week.split('');
      for (var i = 0; i < days.length; i++) {
        courseEvents.push({
          key: course.key,
          course_code: course.course_code,
          title: course.class_name,
          day: days[i],
          hexColor: this.renderDayofWeekColor(course.day_of_week, true),
          start: this.renderStartTime(course.time, days[i], true),
          end: this.renderStartTime(course.time, days[i], false),
          location: course.class_room
        });
      }
    }
  });
  return courseEvents;
}
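
The two methods above turn one class record into one calendar event per meeting day. The frontend does this in JavaScript; the Python below is only a sketch of the same conversion for illustration, with hypothetical names (ISO_WEEKDAY, parse_time_range, course_events). It assumes times always look like "8:00AM-9:35AM" and that AM/PM is parsed in an English locale.

```python
from datetime import datetime

# Day letters used by the catalog, mapped to ISO weekdays (Monday = 1); R is Thursday
ISO_WEEKDAY = {"M": 1, "T": 2, "W": 3, "R": 4, "F": 5, "S": 6}


def parse_time_range(time_range):
    """Turn '8:00AM-9:35AM' into ((8, 0), (9, 35)) in 24-hour time."""
    start, end = (
        datetime.strptime(part.strip(), "%I:%M%p")
        for part in time_range.split("-")
    )
    return (start.hour, start.minute), (end.hour, end.minute)


def course_events(course):
    """Return one event dict per meeting day, e.g. 'MW' gives two events."""
    start, end = parse_time_range(course["time"])
    return [
        {
            "key": course["key"],
            "title": course["class_name"],
            "iso_weekday": ISO_WEEKDAY.get(day, 0),
            "start": start,
            "end": end,
        }
        for day in course["day_of_week"]
    ]


sample = {
    "key": "ACC1000-01-8:00AM-9:35AM-M-38Spring2018 ",
    "class_name": "Introduction To Financial Accounting",
    "day_of_week": "MW",
    "time": "8:00AM-9:35AM",
}
for event in course_events(sample):
    print(event["iso_weekday"], event["start"], event["end"])
```

Parsing with %I and %p handles the 12PM and 12AM edge cases without manual arithmetic.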

React: url pattern

shepherddog.io/app/classes?selected_classes=ACC1000-01-8:00AM-9:35AM-M-38Spring2018 ,ACC1000-03-11:30AM-1:05PM-M-38Spring2018
{
    course_section: "01",
    class_room: " ",
    key: "ACC1000-01-8:00AM-9:35AM-M-38Spring2018 ",
    prof_last_name: "Blanchette-Proulx",
    prof_full_name: "Shay Blanchette-Proulx",
    day_of_week: "MW",
    course_code_raw: "ACC1000",
    course_code: "ACC1000-01",
    spots_taken: 13,
    class_name: "Introduction To Financial Accounting",
    spots_available: 38,
    time: "8:00AM-9:35AM",
    spots_filled: "13 of 38",
    credits: "4.00",
    prof_first_name: "Shay",
    session: "Full Session",
    semester: "Spring2018"
},
{
    course_section: "01",
    class_room: " ",
    key: "ACC1000-01-8:00AM-9:35AM-M-38Spring2018 ",
    prof_last_name: "Blanchette-Proulx",
    prof_full_name: "Shay Blanchette-Proulx",
    day_of_week: "MW",
    course_code_raw: "ACC1000",
    course_code: "ACC1000-01",
    spots_taken: 13,
    class_name: "Introduction To Financial Accounting",
    spots_available: 38,
    time: "8:00AM-9:35AM",
    spots_filled: "13 of 38",
    credits: "4.00",
    prof_first_name: "Shay",
    session: "Full Session",
    semester: "Spring2018"
},
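
The URL pattern above carries the selected keys as one comma-separated query parameter. As a sketch under that assumption, the keys can be recovered and matched against the class list like this. The selected_keys and selected_classes helpers are hypothetical; the app itself does this in React.

```python
from urllib.parse import parse_qs, urlsplit


def selected_keys(url):
    """Return the list of class keys stored in the selected_classes parameter."""
    query = urlsplit(url).query
    values = parse_qs(query).get("selected_classes", [])
    return values[0].split(",") if values else []


def selected_classes(url, all_classes):
    """Return the class objects whose key appears in the URL, in URL order."""
    by_key = {course["key"]: course for course in all_classes}
    return [by_key[key] for key in selected_keys(url) if key in by_key]


url = (
    "https://shepherddog.io/app/classes?selected_classes="
    "ACC1000-01-8:00AM-9:35AM-M-38Spring2018 ,"
    "ACC1000-03-11:30AM-1:05PM-M-38Spring2018"
)
all_classes = [
    {"key": "ACC1000-01-8:00AM-9:35AM-M-38Spring2018 ", "course_code": "ACC1000-01"},
]

print(selected_keys(url))
print(selected_classes(url, all_classes))
```

Keys that are no longer in the catalog are skipped rather than raising an error, so an old saved link still loads the classes that remain.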

Future Expansion & Acquisition

  • Add in Class Track functionality
    • Scrape every 1 min, not 10 min
  • Add in an A/B calendar option
    • Color code for different semesters
  • Save schedule via a login and phone number
    • Link with their Babson login
  • Course Insights & Graduation Progress?

Questions?

woof.

woof.

woof.

Shepherddog.io

By Zachary Bedrosian
