The hard battle for scalability and performance

...or why building a chat application is hard

Context

Who are they?

  • Pure Marketplace
    • No inventory (new project soon)
    • 100% online business
    • Take a commission on every purchase

 

  • Main Growth KPIs
    • Business Value = Amount exchanged
    • Net Promoter Score = Customer Satisfaction

Growth 💸

  • 10 Million unique visitors / month
  • 1 Million BV per day crossed in 2017
  • 2 Million BV per day crossed on 2 July 2018

 

  • +250 million BV total 2017
  • Objective 2018: +500 million BV
  • Objective 2019: +1 billion BV

 

  • 60 million fundraising in 2017
  • New fundraising in 2018

Why a chat widget?

  • Increase the NPS to increase sales
  • Better help before and after the sale
    • Independent "Manodvisors"
    • Multiple conversations simultaneously
    • Customized help
    • Future: video chat, help on DIY projects
  • Direct questions, realtime
  • Increase conversion rate

Chat

=

Differentiating Feature

 

Codename Bengal

demo

Tech Stack

& tooling

Prerequisites

  • Use a SaaS to manage realtime to avoid typical issues
  • Should be multilingual
  • Should scale
  • Should not be strongly tied to the website

Tech challenge

🗓 Sept 2017 - Start of development

  • 3 days (ouch)
  • Tools already benchmarked
    • Pusher
    • PubNub
  • Existing infra used Vagrant
    • Tooling on Vagrant
    • Easy to request a VM from their Ops team

First version

🗓 Dec 17 - First chats in Italy

1 read-only + 1 write-only + N private PubNub channels

Dialogflow for the qualification bot

React for both front ends

Single Node application for history, routing, queuing

Elasticsearch for storing conversation data

Redis for storing Dialogflow session data

All "events" travel through PubNub

First version

Fail!

  • PubNub Chat Engine not ready
  • Misuse of PubNub
    • Subscribers limited to 50 channels
  • Bugs
    • Pooling of transaction was buggy
    • Vulnerability not fixed
  • Very hard to debug
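
A rough sketch of how the first layout (one read-only, one write-only, and N private channels) can run into the 50-channel subscriber limit. The channel names and the assumption that a client subscribes to one private channel per open conversation are hypothetical; the talk does not describe the real scheme.

```typescript
// Hypothetical channel names: the talk does not give the real naming scheme.
const READ_ONLY_CHANNEL = "bengal.read-only";
const WRITE_ONLY_CHANNEL = "bengal.write-only";
// Per-subscriber limit quoted on the slide.
const MAX_CHANNELS_PER_SUBSCRIBER = 50;

// One private channel per conversation (assumed mapping).
function privateChannel(conversationId: string): string {
  return `bengal.private.${conversationId}`;
}

// Channels a client listens on: the shared read-only channel plus one
// private channel per open conversation. The write-only channel is only
// published to, so it is not counted as a subscription here.
function subscribedChannels(conversationIds: string[]): string[] {
  return [READ_ONLY_CHANNEL].concat(conversationIds.map(privateChannel));
}

// Channel a client publishes its events to.
function publishChannel(): string {
  return WRITE_ONLY_CHANNEL;
}

// True when the subscription list is larger than the quoted limit.
function exceedsSubscriberLimit(channels: string[]): boolean {
  return channels.length > MAX_CHANNELS_PER_SUBSCRIBER;
}
```

Under these assumptions, a subscriber with 50 open conversations already needs 51 subscriptions, one more than the limit.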

Learnings 📚

  • Talk to and challenge your SaaS provider early
  • Event-based programming is hard

Second version

🗓 March 18 - Full deploy in Italy

🗓 April 18 - First Italian Manodvisors

PubNub channel groups for realtime (10 000 channels per channel group)

Multiple Node microservices for history, routing, queuing

Only conversation-related events go through PubNub
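
A minimal sketch of spreading conversation channels over channel groups, using the 10 000 channels per group figure from this slide. The group naming (`conversations-0`, `conversations-1`, ...) and the function name are hypothetical, and no PubNub SDK call is shown.

```typescript
// Capacity per channel group, as quoted on the slide.
const CHANNELS_PER_GROUP = 10000;

// Split a flat list of channel names into named groups of at most
// `groupSize` channels each. Group names are illustrative only.
function partitionIntoGroups(
  channels: string[],
  groupSize: number = CHANNELS_PER_GROUP
): Map<string, string[]> {
  if (groupSize < 1 || Math.floor(groupSize) !== groupSize) {
    throw new Error("groupSize must be a positive integer");
  }
  const groups = new Map<string, string[]>();
  for (let start = 0; start < channels.length; start += groupSize) {
    const groupName = `conversations-${start / groupSize}`;
    groups.set(groupName, channels.slice(start, start + groupSize));
  }
  return groups;
}
```

A subscriber would then listen to a handful of groups instead of thousands of individual channels.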

Second version

Better, but still, fail!

Other bugs in PubNub

PubNub latencies

PubNub uptime

PubNub = Major point of failure

Learnings 📚📚📚

  • Talk to and challenge your SaaS provider early, again!
  • Murphy's Law: "whatever can go wrong, will go wrong"
    • Think fallbacks
  • Exception management is hard but paramount
  • Log management is crucial
  • On complex topics, do not hesitate to specialize
  • Challenge the prerequisites of your project
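
One way to read "think fallbacks" and "exception management" together, as a sketch only: try each realtime transport in order, catch failures, and keep the error trail for the logs. The `Transport` interface and `sendWithFallback` are hypothetical names, and the calls are kept synchronous for brevity.

```typescript
// Hypothetical abstraction over a realtime provider.
interface Transport {
  name: string;
  send(message: string): void; // throws on failure
}

interface SendResult {
  deliveredBy: string | null; // name of the transport that succeeded
  errors: string[];           // one entry per failed attempt, for logging
}

// Try transports in priority order and stop at the first success.
// Every exception is caught and recorded instead of crashing the caller.
function sendWithFallback(transports: Transport[], message: string): SendResult {
  const errors: string[] = [];
  for (const transport of transports) {
    try {
      transport.send(message);
      return { deliveredBy: transport.name, errors };
    } catch (err) {
      const reason = err instanceof Error ? err.message : String(err);
      errors.push(`${transport.name}: ${reason}`);
    }
  }
  // Nothing worked: the caller decides what to show the user.
  return { deliveredBy: null, errors };
}
```

Returning the collected errors rather than swallowing them keeps the failure visible to whatever log management is in place.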

Third version

 🗓 June 18 - Start deploying in France

 🗓 July 18 - Full deploy in France, with Manodvisors

 🗓 Sept 18 - Full deploy in the UK, with Manodvisors

Websocket server for realtime events

Multiple Node microservices for history, routing, queuing

Only conversation-related events go through websockets
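
A sketch of the routing core such a websocket server might need: connections join a conversation, and only events for that conversation are pushed to them. `Connection`, `ConversationEvent` and `ConversationHub` are hypothetical names; a real server would wrap sockets from a WebSocket library behind the `Connection` interface.

```typescript
// Anything that can receive a serialized event (a socket, in practice).
interface Connection {
  send(payload: string): void;
}

interface ConversationEvent {
  conversationId: string;
  type: string;
  body: string;
}

class ConversationHub {
  // conversationId -> connections currently attached to that conversation
  private readonly rooms = new Map<string, Set<Connection>>();

  join(conversationId: string, connection: Connection): void {
    const room = this.rooms.get(conversationId) ?? new Set<Connection>();
    room.add(connection);
    this.rooms.set(conversationId, room);
  }

  leave(conversationId: string, connection: Connection): void {
    const room = this.rooms.get(conversationId);
    if (!room) return;
    room.delete(connection);
    if (room.size === 0) this.rooms.delete(conversationId);
  }

  // Deliver an event to every connection in its conversation.
  // Returns the number of connections that were sent the event.
  publish(event: ConversationEvent): number {
    const room = this.rooms.get(event.conversationId);
    if (!room) return 0;
    const payload = JSON.stringify(event);
    let delivered = 0;
    room.forEach((connection) => {
      connection.send(payload);
      delivered += 1;
    });
    return delivered;
  }
}
```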

 

Learnings

  • Browsers do not handle backgrounded tabs well
  • Communicate about your application status and your updates
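
On the backgrounded-tab point: browsers throttle timers in hidden tabs, so heartbeats can arrive late or not at all. A minimal sketch of detecting that gap follows; the `HeartbeatMonitor` name, the 2x tolerance, and the idea of resyncing afterwards are assumptions, not the team's actual fix.

```typescript
// True when the time since the last heartbeat is more than `tolerance`
// intervals, i.e. the tab probably slept or the connection stalled.
function shouldResync(
  lastBeatMs: number,
  nowMs: number,
  intervalMs: number,
  tolerance: number = 2
): boolean {
  return nowMs - lastBeatMs > intervalMs * tolerance;
}

// Small stateful wrapper: call beat() on every heartbeat, and
// needsResync() when the tab becomes visible again.
class HeartbeatMonitor {
  private lastBeatMs: number;
  private readonly intervalMs: number;

  constructor(intervalMs: number, startMs: number) {
    this.intervalMs = intervalMs;
    this.lastBeatMs = startMs;
  }

  beat(nowMs: number): void {
    this.lastBeatMs = nowMs;
  }

  needsResync(nowMs: number): boolean {
    return shouldResync(this.lastBeatMs, nowMs, this.intervalMs);
  }
}
```

In a browser, `needsResync(Date.now())` could be checked from a `visibilitychange` listener on `document`, refetching the conversation state when it returns true.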

Evolution of overall architecture

  • Pusher
  • PubNub chat engine
  • PubNub (no channel groups) + wrapper
  • PubNub (channel groups) + wrapper with microservices
  • PubNub with in-house websockets + microservices
  • Future: all browser calls are REST, all answers come through sockets

 

Made possible by a "wrapper" pattern
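
A minimal sketch of what a "wrapper" pattern can look like: the application talks to a small interface, and each provider (Pusher, PubNub, in-house websockets) gets its own adapter behind it. `RealtimeClient`, `InMemoryRealtimeClient` and `ChatService` are hypothetical names, and the in-memory adapter stands in for a real provider SDK.

```typescript
type Handler = (message: string) => void;

// The only realtime surface the application is allowed to see.
interface RealtimeClient {
  publish(channel: string, message: string): void;
  subscribe(channel: string, handler: Handler): void;
}

// Stand-in adapter. A PubNub or websocket adapter would implement the
// same interface by delegating to its own client library.
class InMemoryRealtimeClient implements RealtimeClient {
  private readonly handlers = new Map<string, Handler[]>();

  publish(channel: string, message: string): void {
    (this.handlers.get(channel) ?? []).forEach((handler) => handler(message));
  }

  subscribe(channel: string, handler: Handler): void {
    const list = this.handlers.get(channel) ?? [];
    list.push(handler);
    this.handlers.set(channel, list);
  }
}

// Application code depends on the interface, never on a provider.
class ChatService {
  private readonly realtime: RealtimeClient;

  constructor(realtime: RealtimeClient) {
    this.realtime = realtime;
  }

  sendMessage(conversationId: string, text: string): void {
    this.realtime.publish(`conversation.${conversationId}`, text);
  }

  onMessage(conversationId: string, handler: Handler): void {
    this.realtime.subscribe(`conversation.${conversationId}`, handler);
  }
}
```

Swapping providers then means writing a new adapter and changing one constructor argument, which is the kind of change the list above goes through several times.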

Some other issues

Dialogflow

  • Formerly API.AI
  • Bought by Google
  • Powerful but...
    • Work in progress
    • No API to export/import agents
    • Can fail (multi-language)

Edge cases

  • Navigation
  • Multi-tab navigation
  • Multi browser navigation
  • Bad connection of client
  • Bad connection of customer service
  • What happens when someone loses their connection?
  • Queuing clients
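
For the bad-connection cases, a common building block is reconnecting with exponential backoff so that a flaky client does not hammer the server. A sketch under assumed defaults (1 s base, 30 s cap), which are not taken from the talk:

```typescript
// Delay before reconnect attempt number `attempt` (0-based):
// base, 2x base, 4x base, ... capped at maxMs.
function reconnectDelayMs(
  attempt: number,
  baseMs: number = 1000,
  maxMs: number = 30000
): number {
  const safeAttempt = Math.max(0, Math.floor(attempt));
  return Math.min(maxMs, baseMs * Math.pow(2, safeAttempt));
}
```

A production version would usually add random jitter so that many clients dropped at the same moment do not all reconnect together.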

Conclusion

Develop quickly...

Build your app on SaaS...

and sturdily

but always think failures

Log everything you can...

but always add context
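
A sketch of what "add context" can mean in practice: every log line is structured and carries the identifiers needed to trace one conversation across services. The field names (`conversationId`, `userId`, `service`) and the `formatLog` helper are hypothetical.

```typescript
type LogLevel = "debug" | "info" | "warn" | "error";

// Identifiers worth attaching to every line (illustrative fields).
interface LogContext {
  conversationId?: string;
  userId?: string;
  service?: string;
}

// Build one JSON log line. `now` is injectable so output is reproducible.
function formatLog(
  level: LogLevel,
  message: string,
  context: LogContext,
  now: Date = new Date()
): string {
  return JSON.stringify({
    timestamp: now.toISOString(),
    level,
    message,
    context,
  });
}
```

With the conversation id on every line, a failed delivery can be followed from the routing service to the realtime layer in a single search.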

Good architecture is paramount...

but you can always change it, and should keep challenging it

For ManoMano

  • Balance between long-term design and pragmatism
  • Problem solving
  • The importance of being one team
  • Transparency to the users

 

DATA!

📈 📊

Chat distribution - Total chats - Dashboard - Manodvisors

Celebrations 🎉

  • ManoMano now has a working, evolvable chat application
  • The team kept its deadlines
  • Theodo and ManoMano learned to work together

 

5+ new teams on other core projects have started since the chat

AfterBuy

AB Testing

Seller Toolbox

Warehouse

Recommendation (data)

Checkout (B2B)

Thank you!