Categorization

Training Data

Overview

  • Accuracy, Precision, Recall Calculation Fix
  • Training Data Format
  • Knowledge Graph Use

Improving Results

On the same data set as last week.

Calculations on Training Data
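
The three measures named in the overview can be sketched as below. This is a minimal illustration, not the project's actual calculation: the function name `classification_metrics`, the label strings, and the choice of "restaurant" as the positive class are all assumptions.

```python
def classification_metrics(y_true, y_pred, positive="restaurant"):
    """Compute accuracy, precision and recall for one positive class.

    y_true / y_pred are parallel lists of labels (hypothetical format).
    """
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1  # correctly flagged as positive
        elif pred == positive:
            fp += 1  # flagged as positive, but was not
        elif truth == positive:
            fn += 1  # missed a real positive
        else:
            tn += 1  # correctly left alone

    total = tp + fp + fn + tn
    return {
        # Guard every division so empty inputs do not raise.
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Made-up labels, for illustration only.
y_true = ["restaurant", "restaurant", "restaurant", "other"]
y_pred = ["restaurant", "restaurant", "other", "restaurant"]
metrics = classification_metrics(y_true, y_pred)
```

Precision divides by everything predicted positive and recall by everything actually positive; swapping those two denominators is a common source of calculation errors.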

New input format

Calculations on Training Data

The frequency counter tool now outputs annotated data automatically

Calculations on Training Data

Gathering Other Seed Data

Decided on other categories for seed data

  1. Travel (Vacation Cities)
  2. Sports (Teams)
  3. 3C Products (Brands)
  4. Politics (US Politicians)
  5. Music (Song Titles)
  6. Clothing / Adornments (Brands)
  7. Arts & Culture (Video Games)
  8. TV & Entertainment (TV Shows)

Using Knowledge Graph

  • To be done after gathering the data for verification
  • Can check whether a query is a restaurant or not.
  • Offers many types - Standardized "Schema"

Query --> Result
              |
              V

Focus on the @type variety
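
A rough sketch of the restaurant check, assuming the result is a JSON-LD style response in which each entry under `itemListElement` has a `result` carrying an `@type` list (the shape returned by the Google Knowledge Graph Search API). The helper names and the sample response are made up for illustration; no network call is made here.

```python
def result_types(response):
    """Return (name, set of @type values) for each result in the response."""
    pairs = []
    for element in response.get("itemListElement", []):
        result = element.get("result", {})
        types = result.get("@type", [])
        if isinstance(types, str):
            # Tolerate a single type given as a bare string.
            types = [types]
        pairs.append((result.get("name"), set(types)))
    return pairs


def is_restaurant(response, wanted="Restaurant"):
    """True if any returned entity lists the wanted type in its @type."""
    return any(wanted in types for _, types in result_types(response))


# Hand-written sample in the assumed response shape (not real API output).
sample_response = {
    "itemListElement": [
        {
            "result": {
                "name": "Example Diner",
                "@type": ["Thing", "Place", "Organization", "Restaurant"],
            }
        }
    ]
}

found = is_restaurant(sample_response)
```

Because one entity usually lists several types, the check looks for membership in the `@type` list rather than comparing against a single value.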

Consider Restaurants

  • We treat "Restaurants" as a blanket term that encompasses all of the lists.
  • In reality, Schema (schema.org) treats Restaurant as one specific subtype.

Multiple Inheritance

Each entity derives its properties from several ancestors. The full set of properties of a particular category is the union of its ancestors' properties plus any of its own.
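
A minimal sketch of how inherited properties could be collected. The hierarchy follows schema.org (LocalBusiness has two parents, Organization and Place), but the property sets are a small illustrative subset, and the names `PARENTS`, `OWN_PROPERTIES`, `ancestors` and `all_properties` are assumptions.

```python
# Direct parents of each type; LocalBusiness shows multiple inheritance.
PARENTS = {
    "Thing": [],
    "Place": ["Thing"],
    "Organization": ["Thing"],
    "LocalBusiness": ["Organization", "Place"],
    "FoodEstablishment": ["LocalBusiness"],
    "Restaurant": ["FoodEstablishment"],
}

# Properties declared on the type itself (illustrative subset only).
OWN_PROPERTIES = {
    "Thing": {"name", "description"},
    "Place": {"geo"},
    "Organization": {"legalName"},
    "LocalBusiness": {"openingHours", "priceRange"},
    "FoodEstablishment": {"servesCuisine", "acceptsReservations"},
    "Restaurant": set(),
}


def ancestors(type_name):
    """All types reachable by following parent links, each visited once."""
    seen = set()
    stack = list(PARENTS[type_name])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS[current])
    return seen


def all_properties(type_name):
    """Own properties plus the union of every ancestor's properties."""
    props = set(OWN_PROPERTIES[type_name])
    for ancestor in ancestors(type_name):
        props |= OWN_PROPERTIES[ancestor]
    return props


restaurant_props = all_properties("Restaurant")
```

In this sketch Restaurant declares nothing of its own, yet it ends up with `geo` via Place and `legalName` via Organization, which is the multiple-inheritance effect described above.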

Inherited Properties

Another Example

7.7

By katiec089
