CV redaction

As a hiring team, we want anonymised CVs so that we can run a fairer and more objective screening process.

1) How can we programmatically detect the things that need redacting in a CV, and how do we then redact them?

2) How can we do that redaction in a scalable and automated way?

CV redaction

  • Build vs Buy
  • Word detection
  • Word redaction
  • Tying it all together

Build vs Buy

Word detection

  • NLP (natural language processing)
  • Word matching?
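As a rough illustration of the word-matching option, the sketch below searches CV text for terms we already know about the candidate, such as the name typed into an application form. The function name `find_known_terms` and the sample text are hypothetical and not taken from the talk.

```python
import re

def find_known_terms(text: str, terms: list[str]) -> list[tuple[int, int]]:
    """Return sorted (start, end) spans where a known term appears as a whole word."""
    spans = []
    for term in terms:
        # \b stops "Sam" matching inside "Samsung"; matching is case-insensitive.
        pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
        spans.extend(m.span() for m in pattern.finditer(text))
    return sorted(spans)

# Hypothetical sample data for illustration only.
cv_text = "Sam Jones led the Samsung account. Contact SAM for references."
spans = find_known_terms(cv_text, ["Sam", "Jones"])
matched = [cv_text[start:end] for start, end in spans]
```

Plain matching like this only finds terms we already hold, which is one reason NLP is listed as the other option.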

Word matching - regex

^[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$
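Because the pattern above starts with `^` and ends with `$`, it only matches when the whole string is an email address. A minimal sketch of using it to scan CV text, assuming we simply drop those two anchors, might look like this. The names `EMAIL_PATTERN`, `find_emails` and the sample text are illustrative assumptions.

```python
import re

# The slide's pattern without the ^ and $ anchors, so it can match inside longer text.
EMAIL_PATTERN = re.compile(
    r"[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+"
    r"@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?"
    r"(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*"
)

def find_emails(text: str) -> list[str]:
    """Return every substring of text that looks like an email address."""
    # The pattern has no capturing groups, so findall returns the full matches.
    return EMAIL_PATTERN.findall(text)

# Hypothetical sample text for illustration only.
sample = "Contact: jane.doe@example.com, or call 07700 900123."
emails = find_emails(sample)
```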

Word redaction
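One simple way to redact, once detection has produced character spans, is to overwrite each span with a mask character. The sketch below assumes we are working on plain text; redacting inside a real PDF would need a PDF library and is not shown. `redact_spans` and its parameters are hypothetical names.

```python
def redact_spans(text: str, spans: list[tuple[int, int]], mask: str = "█") -> str:
    """Replace each (start, end) span with mask characters of the same length."""
    chars = list(text)
    for start, end in spans:
        # Same-length replacement keeps every other character at its original index.
        chars[start:end] = mask * (end - start)
    return "".join(chars)

# Hypothetical example: the spans cover "Sam" and "Jones".
redacted = redact_spans("Name: Sam Jones", [(6, 9), (10, 15)])
```

Keeping the text length unchanged means spans found by several detectors can be applied in any order.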

Tying it all together

1) How can we programmatically detect the things that need redacting in a CV, and how do we then redact them?

2) How can we do that redaction in a scalable and automated way?

The infrastructure

1️⃣ Check it meets our restrictions on size and type
2️⃣ Scan it for viruses
3️⃣ Upload the raw file to our file store
4️⃣ Convert the file to pdf
5️⃣ Upload the pdf file to our file store
6️⃣ Redact the CV
7️⃣ Upload the redacted file to our file store
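The first step in the list above, checking size and type, could be sketched roughly as follows. The limit `MAX_BYTES`, the set `ALLOWED_SUFFIXES` and the function `check_upload` are assumptions made for illustration; the talk does not state the actual restrictions.

```python
from pathlib import Path

MAX_BYTES = 5 * 1024 * 1024                   # hypothetical size limit (5 MiB)
ALLOWED_SUFFIXES = {".pdf", ".doc", ".docx"}  # hypothetical allowed file types

def check_upload(filename: str, size_bytes: int) -> None:
    """Raise ValueError if the file breaks the (assumed) type or size restrictions."""
    suffix = Path(filename).suffix.lower()
    if suffix not in ALLOWED_SUFFIXES:
        raise ValueError(f"unsupported file type: {suffix or '(none)'}")
    if size_bytes > MAX_BYTES:
        raise ValueError(f"file too large: {size_bytes} bytes")

check_upload("my_cv.docx", 200_000)  # passes silently
```

A file extension is only a hint about the real content, which is one reason a separate virus scan follows this check.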

How (most) of the internet works

Client

💻

POST https://app.beapplied.com/login-api

{
    "email": "hewisCool@beapplied.com",
    "password": "secretPassword123"
}

Server

📠

Reads the request

Checks whether a user with that email address exists

If one does, checks that the password is correct

If it is, OK! Returns a response to the user
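The server steps above can be sketched as a single function. This is only an illustration using an in-memory user store: `USERS`, `hash_password` and `handle_login` are hypothetical names, and the status codes and response bodies are assumptions, not the real login API.

```python
import hashlib
import hmac
import json
import os

def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a salted hash so the store never holds the plain password."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)

# Hypothetical in-memory user store: email -> (salt, password hash).
_salt = os.urandom(16)
USERS = {"hewisCool@beapplied.com": (_salt, hash_password("secretPassword123", _salt))}

def handle_login(request_body: str) -> tuple[int, dict]:
    data = json.loads(request_body)                # read the request
    record = USERS.get(data.get("email"))          # does a user with that email exist?
    if record is None:
        return 401, {"error": "invalid credentials"}
    salt, stored_hash = record
    supplied_hash = hash_password(data.get("password", ""), salt)
    if not hmac.compare_digest(supplied_hash, stored_hash):  # is the password correct?
        return 401, {"error": "invalid credentials"}
    return 200, {"status": "ok"}                   # ok: respond to the user

status, body = handle_login(
    '{"email": "hewisCool@beapplied.com", "password": "secretPassword123"}'
)
```

The client waits for this whole function to finish before it receives a response, which is what makes the exchange synchronous.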

Client

💻

1️⃣ Check it meets our restrictions on size and type
2️⃣ Scan it for viruses
3️⃣ Upload the raw file to our file store
4️⃣ Convert the file to pdf
5️⃣ Upload the pdf file to our file store
6️⃣ Redact the CV
7️⃣ Upload the redacted file to our file store

Synchronous

Async
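To make the synchronous versus asynchronous distinction concrete, here is a minimal sketch using a queue and a background worker thread. It assumes, purely for illustration, that the request handler does a quick piece of work while the client waits and hands the slower steps to the worker. Where the real pipeline draws that line is not stated here, and `handle_upload`, `worker` and the file id are hypothetical.

```python
import queue
import threading

jobs = queue.Queue()  # file ids waiting for background processing
processed = []        # what the worker has finished, for demonstration

def worker() -> None:
    """Asynchronous part: runs in the background, after the client has its response."""
    while True:
        file_id = jobs.get()
        if file_id is None:  # sentinel value tells the worker to stop
            break
        processed.append(f"{file_id}: converted, redacted, stored")

def handle_upload(file_id: str) -> str:
    """Synchronous part: the client waits only for this function to return."""
    jobs.put(file_id)  # hand the slow steps to the worker
    return "accepted"

thread = threading.Thread(target=worker)
thread.start()
response = handle_upload("cv-123")  # returns without waiting for the slow steps
jobs.put(None)                      # ask the worker to finish
thread.join()
```

The trade-off is that the client gets a fast response but has to find out later, by some other means, when the redacted file is ready.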

Demo