Assessing professional skills in a multi-scale environment by means of graph-based algorithms

Jose María Alvarez-Rodríguez and Ricardo Colomo-Palacios 

The First European Network Intelligence Conference, September 29-30, 2014 - Wroclaw (Poland)

The Problem...

Can someone endorse both "Sales" and "Java"?

Why and when is a skill endorsed?

Can we trust such skills?

How are skills spread?

Can we align skills to an existing expertise scale or ranking?

...

...

What is the meaning of "30" in "Semantic Web"?

...multiple expertise scales...

“Professional Development Ladder”, by Construx Software: a 4-level classification of expertise

The Individual Competency Index (ICI)

The ICI index.

A 5-level competency index.

...in other words...

Skill explosion

  • Endorsements of multiple skills
  • ...from different profiles to others
  • ...triggered by:
    • Professional/Personal relationships
    • Professional/Personal interests
    • Professional/Personal education
    • User's activity in the social network
      • The more active a user is, the more endorsements she receives...
    • ...

How to...

Establish a reliable quantitative value for every endorsed skill...

Align quantified skills to an existing scale?

Which scale should be selected: 4 or 5 levels?

Potential Impact

Expertise ranking and retrieval

Human Resources management

Organizational knowledge capitalization

Currently on the web...

Meritocracy...

A lot of related work...

Advantages

  • Widely studied and well-known techniques
    • Graph-based
  • Focus on expertise retrieval
    • Moving from document to expertise retrieval
  • Based on collaborative tagging
  • ...

Drawbacks

  • Few studies of existing professional social networks
  • No alignment to existing professional skills
  • Lack of attention to user's activity
  • Quantitative and qualitative mappings are not completely covered
  • ...

Main Contributions...

  • Application of existing graph-based techniques to rank skills
    • The HITS and SPEAR algorithms
  • Adaptation of the SPEAR algorithm to professional social networks
    • The Skillrank technique
    • Study of user's activity and interactions
  • Mapping to existing professional scales
    • 4/5 scale under study to avoid central tendency
  • Case study of a restricted dataset generated from the Linkedin API

The HITS algorithm

Background

"-A good hub increases the authority weight of the pages it points to.
-A good authority increases the hub weight of the pages that point to it."

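The mutual reinforcement quoted above can be sketched in Python as follows. This is a minimal illustration, assuming the graph is given as a list of (source, target) pairs and that scores are L2-normalised after each pass; the node names and the iteration count are hypothetical and not taken from the study.

```python
import math


def hits(edges, iterations=50):
    """Return (hub, authority) score dicts for a directed graph of (source, target) pairs."""
    nodes = sorted({node for edge in edges for node in edge})
    hub = {node: 1.0 for node in nodes}
    auth = {node: 1.0 for node in nodes}
    for _ in range(iterations):
        # Authority update: add up the hub scores of the nodes pointing to each node.
        auth = {node: 0.0 for node in nodes}
        for src, dst in edges:
            auth[dst] += hub[src]
        # Hub update: add up the authority scores of the nodes each node points to.
        new_hub = {node: 0.0 for node in nodes}
        for src, dst in edges:
            new_hub[src] += auth[dst]
        hub = new_hub
        # L2 normalisation keeps the scores bounded (an assumption of this sketch).
        for scores in (hub, auth):
            norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
            for node in scores:
                scores[node] /= norm
    return hub, auth


# Hypothetical endorsement graph: an edge (a, b) means "a endorses b".
edges = [("ana", "carl"), ("bob", "carl"), ("ana", "dana")]
hub, auth = hits(edges)
print(max(auth, key=auth.get), max(hub, key=hub.get))
```

In this toy graph the most endorsed node gets the highest authority score and the node that endorses the most gets the highest hub score.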
The SPEAR Algorithm

Background

  • Mutual reinforcement of user expertise and document quality (collaborative tagging)
  • Discoverers vs. followers
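A minimal Python sketch of the mutual reinforcement between user expertise and document quality. It assumes the formulation commonly given for SPEAR: each cell of the user-by-document matrix holds 1 plus the number of users who tagged the document later (0 if the user never tagged it), and a square-root credit function damps the advantage of early discoverers over followers. The sample matrix, the sum normalisation and the iteration count are assumptions made for illustration.

```python
import math


def normalise(values):
    """Scale a list of non-negative scores so that they sum to 1."""
    total = sum(values) or 1.0
    return [v / total for v in values]


def spear(matrix, iterations=50):
    """Return (expertise, quality) for a user-by-document matrix of time-ordered credits."""
    n_users, n_docs = len(matrix), len(matrix[0])
    # Credit function: the square root rewards discoverers without letting them dominate.
    credit = [[math.sqrt(v) for v in row] for row in matrix]
    expertise = [1.0] * n_users
    quality = [1.0] * n_docs
    for _ in range(iterations):
        # A user is an expert if she tagged high-quality documents (early).
        expertise = [sum(credit[i][j] * quality[j] for j in range(n_docs))
                     for i in range(n_users)]
        # A document is of high quality if experts tagged it.
        quality = [sum(credit[i][j] * expertise[i] for i in range(n_users))
                   for j in range(n_docs)]
        expertise = normalise(expertise)
        quality = normalise(quality)
    return expertise, quality


# Hypothetical data: user 0 discovered both documents, users 1 and 2 followed.
matrix = [[3, 2],
          [2, 0],
          [1, 1]]
expertise, quality = spear(matrix)
print(expertise, quality)
```

The discoverer (user 0) ends up with the highest expertise score because her credits are at least as large as those of the followers in every column.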

The Skillrank technique

An interpretation of the SPEAR algorithm... tagging users instead of documents

A social network as a community C...

...but a community is also the union of several sub-communities

2 Contexts: Local (MyLinkedin) vs Global (Linkedin).

...and user's activity (local context)

Correlated endorsements to the same user in a sub-community.

Local Context (II)

Correlated endorsements to different users in a sub-community

Independent endorsements to the same user in a sub-community.

Global Context

Correlated endorsements to different users in a community

Independent endorsements to different users in a community

The Skillrank technique

Re-interpreting the SPEAR algorithm

  • The vector of expertise and quality scores

...naming change?

Document/User

  • Two-step process
    • From a local expertise assessment to a global one

"0 represents that a user (Um) has not yet endorsed a skill (Sk), while another value such as 2 in a cell represents that a user endorsed a skill before another user."

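The cell values described in the quote can be sketched as follows, assuming every endorsement of one profile is available as an (endorser, skill, timestamp) triple. Whether such timestamps can actually be obtained is not stated here, so the event format, the function name and the sample data are all hypothetical.

```python
def build_matrix(events, users, skills):
    """Build a user-by-skill matrix: 0 = no endorsement, 1 + number of later endorsers otherwise."""
    matrix = [[0] * len(skills) for _ in users]
    for k, skill in enumerate(skills):
        # Keep the earliest endorsement time of each endorser for this skill.
        first_time = {}
        for endorser, event_skill, timestamp in events:
            if event_skill == skill:
                first_time[endorser] = min(timestamp, first_time.get(endorser, timestamp))
        for m, user in enumerate(users):
            if user in first_time:
                later = sum(1 for t in first_time.values() if t > first_time[user])
                matrix[m][k] = 1 + later
    return matrix


users = ["u1", "u2", "u3"]
skills = ["Java", "Sales"]
events = [("u1", "Java", 1), ("u2", "Java", 2), ("u3", "Java", 3), ("u2", "Sales", 5)]
for row in build_matrix(events, users, skills):
    print(row)
```

Here u2 gets the value 2 for "Java" because exactly one other user (u3) endorsed that skill afterwards, which matches the reading of the quote.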
Case Study

Linkedin

A Professional Social Network

Research Design

  1. Select the techniques whose precision will be compared

    1. The HITS, SPEAR and Skillrank techniques

  2. Select and prepare the dataset.

    • 10 users (average of 50 connections) and 5 skills

  3. Create a dataset for unit testing purposes

    • A panel of experts established the different levels in both scales (4 and 5)

  4. Include frequency as a baseline technique for each user and skill

  5. Run every technique on every dataset

  6. Calculate the percentiles (quartiles and quintiles) for every user, skill and technique to align to scales.

    1. Extract precision and compare
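The last step can be sketched in Python as below. This is a minimal illustration for a single skill and a single technique, assuming the scores are mapped to levels with percentile cut points (quartiles for the 4-level scale, quintiles for the 5-level one) and that precision is the share of users whose level matches the one set by the expert panel. The scores and expert levels shown are invented for the example.

```python
from bisect import bisect_right
from statistics import quantiles


def to_levels(scores, n_levels):
    """Map each user's score to a level in 1..n_levels using percentile cut points."""
    cuts = quantiles(list(scores.values()), n=n_levels)  # n_levels - 1 cut points
    return {user: bisect_right(cuts, score) + 1 for user, score in scores.items()}


def precision(predicted, expected):
    """Fraction of users whose predicted level equals the expert-assigned level."""
    matches = sum(1 for user, level in expected.items() if predicted.get(user) == level)
    return matches / len(expected)


# Hypothetical scores produced by one technique for one skill.
scores = {"u1": 0.1, "u2": 0.2, "u3": 0.3, "u4": 0.4,
          "u5": 0.5, "u6": 0.6, "u7": 0.7, "u8": 0.8}
# Hypothetical levels assigned by the panel of experts on the 4-level scale.
expert_levels = {"u1": 1, "u2": 1, "u3": 2, "u4": 2,
                 "u5": 3, "u6": 4, "u7": 4, "u8": 4}

levels = to_levels(scores, 4)
print(levels)
print(precision(levels, expert_levels))
```

Calling `to_levels(scores, 5)` gives the quintile-based alignment for the 5-level scale in the same way.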

Request to Linkedin

https://api.linkedin.com/v1/people/~/connections:(id,first-name,last-name,formatted-name,email-address,headline,industry,location,num-connections,summary,specialties,positions,site-standard-profile-request,public-profile-url,api-standard-profile-request,proposal-comments,associations,honors,interests,publications,patents,languages,skills:(id,skill,proficiency,years),certifications,educations,courses,volunteer,three-current-positions,num-recommenders,following,job-bookmarks,date-of-birth,member-url-resources,connections)
My network (450+ connections) was asked to execute this request and save the XML output (20).
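The field selector in the request above follows a simple nested pattern, which can be sketched in Python as below. The helper name is hypothetical and only a few of the fields are shown; the sketch only builds the URL string and leaves out authentication and the request itself.

```python
BASE_URL = "https://api.linkedin.com/v1/people/~/connections"


def field_selector(fields):
    """Render a list of field names, or (name, subfields) pairs, as '(a,b,c:(d,e))'."""
    parts = []
    for field in fields:
        if isinstance(field, tuple):
            name, subfields = field
            parts.append(f"{name}:{field_selector(subfields)}")
        else:
            parts.append(field)
    return "(" + ",".join(parts) + ")"


# A subset of the fields used in the request above.
fields = ["id", "first-name", "last-name",
          ("skills", ["id", "skill", "proficiency", "years"])]
url = f"{BASE_URL}:{field_selector(fields)}"
print(url)
```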

Skills selection

Technical and business ones

ICI Index

Professional Development Ladder

Frequency as basic measure

Frequency = (number of endorsements a user has received in a given skill) / (maximum number of endorsements that user can receive in that skill)
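The ratio above can be sketched in Python as follows. The maximum number of endorsements is taken here to be the user's number of connections, which is an assumption of this sketch; the function name and the sample figures are hypothetical.

```python
def frequency(received, maximum):
    """Share of the possible endorsements that a user actually received for one skill."""
    if maximum <= 0:
        return 0.0  # no possible endorsers, so no evidence for the skill
    return received / maximum


# Hypothetical user with 50 connections (assumed upper bound on endorsements).
connections = 50
endorsements = {"Java": 10, "Sales": 0}
for skill, received in endorsements.items():
    print(skill, frequency(received, connections))
```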

% of correctly classified skills for every skill and technique

Graphical view of the % of correctly classified skills per technique and type of scale

Research Limitations

  • The Linkedin API restricts the visibility of some relevant fields if you are not a 1st-degree connection.

  • The dataset is restricted to a very small number of skills and users.

  • The mapping to the professional scales should be done by a wider panel of experts.

  • The average is not the most adequate aggregation operator.

  • A full test campaign should be executed to validate the parameters of the different techniques.

  • ...

Conclusions

  • Several techniques for expertise ranking are available but a mapping to professional scales is still under study
  • The HITS and SPEAR algorithms are good enough for most cases
    • Frequency is not an accurate technique
  • The Skillrank technique is a very promising process but needs more testing and refinement
    • Existing social networks provide us with a unique opportunity
  • The 4-level scale avoids central tendency.
  • ...

Future Work

  • Refinement of the Skillrank technique

  • Overcome the research limitations

    • Creation of larger datasets for training and test purposes

    • Test campaign

  • Comparison of skills in different professional social networks

    • Xing, Quora, Stackoverflow, etc.

  • Make the source code and datasets publicly available under the principles of OpenScience

  • ...

Q & A

Credits

  • Prof. Dr. Ricardo Colomo-Palacios
  • Faculty of Computer Science, Østfold University College, Halden, NORWAY

  • E-mail: ricardo.colomo@hiof.no
  • WWW: 
