Assessing professional skills in a multi-scale environment by means of graph-based algorithms

Jose María Alvarez-Rodríguez and Ricardo Colomo-Palacios

The First European Network Intelligence Conference,

September 29-30, 2014 - Wroclaw (Poland)

The Problem...

Can someone endorse both "Sales" and "Java"?

Why/when is a skill endorsed?

Can we trust such skills?

How are skills spread?

Can we align skills to an existing expertise scale or ranking?

...

What is the meaning of "30" in "Semantic Web"?

...multiple expertise scales...

“Professional Development Ladder”, by Construx Software, a 4-level classification of expertise

The Individual Competency Index (ICI)

The ICI index.

A 5-scale competency index.

...in other words...

Skill explosion

Endorsements of multiple skills
...from different profiles to others
...triggered by:
- Professional/Personal relationships
- Professional/Personal interests
- Professional/Personal education
- User's activity in the social network
  - The more active a user is, the more endorsements she receives...
- ...

How to...

Establish a reliable quantitative value for every endorsed skill...

Align quantified skills to an existing scale?

Which scale should be selected? 4 or 5 levels?

Potential Impact

Expertise ranking and retrieval

Human Resources management

Organizational knowledge capitalization

Currently on the web...

Meritocracy...

A lot of related work...

Authoritative Sources in a Hyperlinked Environment. The HITS Algorithm. (Kleinberg, 1998)
Expertise Networks in Online Communities: Structure and Algorithms. (Zhang et al., 2007)
Graph-based ranking algorithms for e-mail expertise analysis. (B. Dom et al., 2003)
Expertise Identification using Email Communications. (Campbell et al., 2003)
Telling Experts from Spammers: Expertise Ranking in Folksonomies (The SPEAR algorithm). (Noll et al., 2009)
On the assessment of expertise profiles. (Berendsen et al., 2013)
...

Advantages

Widely studied and well-known techniques
- Graph-based
Focus on expertise retrieval
- Moving from document to expertise retrieval
Based on collaborative tagging
...

Drawbacks

Few studies of existing professional social networks
No alignment to existing professional skills
Lack of attention to user's activity
Quantitative and qualitative mappings are not completely covered
...

Main Contributions...

Application of existing graph-based techniques to rank skills

- The HITS and SPEAR algorithms

Adaptation of the SPEAR algorithm to professional social networks

- The Skillrank technique

Study of user's activity and interactions

Mapping to existing professional scales

4/5 scale under study to avoid central tendency

Case study of a restricted dataset generated from the Linkedin API

The HITS algorithm

Background

- "A good hub increases the authority weight of the pages it points to."
- "A good authority increases the hub weight of the pages that point to it."

Source: http://www.math.cornell.edu/~mec/Winter2009/RalucaRemus/Lecture4/lecture4.html
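
The mutual reinforcement quoted above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in the study: the function name `hits`, the edge-list input format, the fixed number of iterations and the Euclidean normalisation are all assumptions made for the example.

```python
from math import sqrt


def hits(edges, iterations=50):
    """Return (hub, authority) score dicts for a directed graph.

    `edges` is a list of (source, target) pairs; node names are illustrative.
    """
    nodes = sorted({node for edge in edges for node in edge})
    hub = {node: 1.0 for node in nodes}
    auth = {node: 1.0 for node in nodes}
    for _ in range(iterations):
        # Authority update: add up the hub scores of the nodes pointing here.
        auth = {node: 0.0 for node in nodes}
        for source, target in edges:
            auth[target] += hub[source]
        # Hub update: add up the authority scores of the nodes pointed to.
        new_hub = {node: 0.0 for node in nodes}
        for source, target in edges:
            new_hub[source] += auth[target]
        hub = new_hub
        # Normalise both vectors to unit Euclidean length so they stay bounded.
        for scores in (hub, auth):
            norm = sqrt(sum(value * value for value in scores.values())) or 1.0
            for node in scores:
                scores[node] /= norm
    return hub, auth
```

For example, `hits([("a", "c"), ("b", "c"), ("a", "d")])` gives `c` the highest authority score (two nodes point to it) and `a` the highest hub score (it points to both authorities).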

The SPEAR Algorithm

Background

Mutual reinforcement of user expertise and document quality (collaborative tagging)
Discoverers vs. followers

Source: http://www.michael-noll.com/projects/spear-algorithm/
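
A minimal sketch of the SPEAR idea, assuming its usual formulation: a user who tags a document earlier (a discoverer) gets more credit than a later one (a follower), the credit is dampened with a square root, and expertise and quality scores reinforce each other iteratively. The names `spear` and `normalise`, the triple-based input format, the sum-to-one normalisation and the sample data are assumptions made for this example.

```python
from math import sqrt


def normalise(scores):
    """Scale the scores so that they sum to 1 (normalisation chosen for this sketch)."""
    total = sum(scores.values()) or 1.0
    return {key: value / total for key, value in scores.items()}


def spear(taggings, iterations=50):
    """Return (expertise, quality) dicts.

    `taggings` holds (user, document, timestamp) triples, at most one per
    user/document pair.
    """
    taggings = list(taggings)
    users = sorted({user for user, _, _ in taggings})
    docs = sorted({doc for _, doc, _ in taggings})
    # Credit for tagging a document: sqrt(1 + number of later taggers),
    # so discoverers earn more than followers.
    credit = {}
    for user, doc, when in taggings:
        later = sum(1 for _, d, t in taggings if d == doc and t > when)
        credit[(user, doc)] = sqrt(1 + later)
    expertise = {user: 1.0 for user in users}
    quality = {doc: 1.0 for doc in docs}
    for _ in range(iterations):
        # Expertise update: weighted sum of the quality of the tagged documents.
        new_expertise = {user: 0.0 for user in users}
        for (user, doc), weight in credit.items():
            new_expertise[user] += weight * quality[doc]
        # Quality update: weighted sum of the expertise of the taggers.
        new_quality = {doc: 0.0 for doc in docs}
        for (user, doc), weight in credit.items():
            new_quality[doc] += weight * new_expertise[user]
        expertise = normalise(new_expertise)
        quality = normalise(new_quality)
    return expertise, quality
```

With the taggings `("alice", "doc1", 1)`, `("bob", "doc1", 2)`, `("carol", "doc1", 3)`, `("alice", "doc2", 1)` and `("carol", "doc2", 2)`, alice discovers both documents and ends up with the highest expertise score.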

The Skillrank technique

An interpretation of the SPEAR algorithm... tagging users instead of documents

A social network as a community C...

...but a community is also the union of several sub-communities

2 Contexts: Local (MyLinkedin) vs Global (Linkedin).

...and user's activity

(local context)

Correlated-endorsements to the same user in a sub-community.

Local Context (II)

Correlated-endorsements to different users in a sub-community

Independent endorsements to the same user in a sub-community.

Global Context

Correlated-endorsements to different users in a community

Independent endorsements to different users in a community

The Skillrank technique

Re-interpreting the SPEAR algorithm

The vector of expertise and quality scores

...naming change?

~~Document~~/User

Two-step process
- From a local expertise assessment to a global one

"A 0 in a cell means that a user (Um) has not yet endorsed a skill (Sk), while any other value, such as 2, means that the user endorsed the skill before another user."

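One way to build such a user-by-skill matrix is sketched below. It assumes the SPEAR convention that a cell holds 1 plus the number of users who endorsed the same skill later, so that 0 means "not endorsed" and 2 means "endorsed before one other user". The function name `endorsement_matrix`, the input format and the restriction to the endorsements of a single profile are assumptions made for this illustration.

```python
def endorsement_matrix(endorsements):
    """Build a user-by-skill matrix from (endorser, skill, timestamp) triples.

    Assumes all triples refer to the same endorsed profile and that each
    endorser endorses a given skill at most once.
    """
    users = sorted({user for user, _, _ in endorsements})
    skills = sorted({skill for _, skill, _ in endorsements})
    # Every cell starts at 0: the user has not endorsed the skill.
    matrix = [[0] * len(skills) for _ in users]
    for user, skill, when in endorsements:
        # Count the users who endorsed the same skill after this user.
        later = sum(1 for _, s, t in endorsements if s == skill and t > when)
        matrix[users.index(user)][skills.index(skill)] = 1 + later
    return users, skills, matrix
```

For the endorsements `("u1", "Java", 1)`, `("u2", "Java", 2)` and `("u1", "Sales", 5)`, the rows are `u1` and `u2`, the columns are `Java` and `Sales`, and the matrix is `[[2, 1], [1, 0]]`.
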
Case Study

A Professional Social Network

Research Design

Select the techniques whose precision will be compared
- The HITS, SPEAR and Skillrank techniques
Select and prepare the dataset
- 10 users (average of 50 connections) and 5 skills
Create a dataset for unit testing purposes
- A panel of experts has established the different levels in both scales (4 and 5)
Include frequency as a baseline technique for each user and skill
Run every technique on every dataset
Calculate the percentiles (quartiles and quintiles) for every user, skill and technique to align them to the scales
Extract precision and compare
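
The last two steps can be sketched as follows. This is a minimal illustration under stated assumptions: scores are mapped to levels with a simple rank-based percentile rule (one of several possible conventions), and precision is taken as the share of users whose level matches the one set by the expert panel. The names `to_levels` and `precision`, and the sample scores and expert levels, are hypothetical.

```python
def to_levels(scores, levels):
    """Map each score to a level in 1..levels by its rank-based percentile.

    Use levels=4 for quartiles and levels=5 for quintiles.
    """
    count = len(scores)
    result = {}
    for key, value in scores.items():
        # Number of scores strictly below this one decides the percentile bin.
        below = sum(1 for other in scores.values() if other < value)
        result[key] = 1 + (below * levels) // count
    return result


def precision(predicted, expected):
    """Share of items whose predicted level equals the expected level."""
    matches = sum(1 for key in expected if predicted.get(key) == expected[key])
    return matches / len(expected)
```

For instance, with the scores `{"u1": 0.1, "u2": 0.4, "u3": 0.5, "u4": 0.9}` for one skill, `to_levels(scores, 4)` yields levels 1 to 4 in that order. Against the hypothetical expert levels `{"u1": 1, "u2": 2, "u3": 4, "u4": 4}` the precision is 0.75.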

Request to Linkedin

https://api.linkedin.com/v1/people/~/connections:(id,first-name,last-name,formatted-name,
email-address,headline,industry,location,num-connections,summary,specialties,positions,
site-standard-profile-request,public-profile-url,api-standard-profile-request,proposal-
comments,associations,honors,interests,publications,patents,languages,skills:
(id,skill,proficiency,years),certifications,educations,courses,volunteer,
three-current-positions,num-recommenders,following,job-bookmarks,date-of-birth,
member-url-resources,connections)

My network (450+ connections) was asked to run this request and save the XML output (20).

Skills selection

Technical and business ones

ICI Index

Professional Development Ladder

Frequency as basic measure

$$\mathrm{frequency}(u, s) = \frac{\text{number of endorsements user } u \text{ has received in skill } s}{\text{maximum number of endorsements user } u \text{ can receive in skill } s}$$
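
A minimal sketch of this baseline: the endorsements a user has received in a skill, divided by the maximum that user could receive. The slides do not say how the maximum is obtained, so here it is passed in per user (for example, the number of connections), which is an assumption. The function name `frequency_scores` and the input format are also illustrative.

```python
from collections import Counter


def frequency_scores(endorsements, max_endorsements):
    """Return {(user, skill): received / maximum} for every endorsed skill.

    `endorsements` holds (endorser, endorsed_user, skill) triples and
    `max_endorsements` maps each endorsed user to the maximum number of
    endorsements that user can receive in a skill (assumed to be known).
    """
    received = Counter((endorsed, skill) for _, endorsed, skill in endorsements)
    return {
        (user, skill): count / max_endorsements[user]
        for (user, skill), count in received.items()
        # Skip users without a known, positive maximum to avoid dividing by zero.
        if max_endorsements.get(user, 0) > 0
    }
```

If `u1` can receive at most 4 endorsements and has 2 in "Java" and 1 in "Sales", the scores are 0.5 and 0.25 respectively.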

% of correctly classified skills for every skill and technique

Graphical view of the % of correctly classified skills per technique and type of scale

Research Limitations

The Linkedin API restricts the visibility of some relevant fields if you are not a 1st-degree connection.
The dataset is restricted to a very small number of skills and users.
The mapping to the professional scales should be done by a wider panel of experts.
The average is not the most adequate aggregation operator.
A full test campaign should be executed to validate the parameters of the different techniques.
...

Conclusions

Several techniques for expertise ranking are available but a mapping to professional scales is still under study

The HITS and SPEAR algorithms are good enough for most cases

- Frequency is not an accurate technique

The Skillrank technique is a very promising process but needs more testing and refinement

Existing social networks provide us with a unique opportunity

- The 4-level scale avoids central tendency.
...

Future Work

Refinement of the Skillrank technique
Overcome the research limitations
- Creation of larger datasets for training and test purposes
- Test campaign
Comparison of skills in different professional social networks
- Xing, Quora, Stackoverflow, etc.
Make the source code and datasets publicly available under the principles of Open Science
...

Q & A

Credits

Dr. Jose María Alvarez-Rodríguez
Carlos III University of Madrid, Spain
E-mail: josemaria.alvarez@uc3m.es
WWW:
- http://purl.org/krgroup/web
- http://www.josemalvarez.es

Prof. Dr. Ricardo Colomo-Palacios
Faculty of Computer Science, Østfold University College, Halden, Norway
E-mail: ricardo.colomo@hiof.no
WWW:
- http://www.rcolomo.com/
