Inside Pandora(v1)

2016.11.10

Jesse Fang

Architecture

Nginx

IIS

Node.js

ElasticSearch Cluster

LINQPad Driver

Pandora Website

Data Importer

SML

ElasticSearch

Logstash

RESTful JSON API

SlapiEval Engine

Front-End

Back-End

ElasticSearch

  • Built on Apache Lucene; open source, 19,242 stars on GitHub
  • Distributed, scalable, and highly available

  • Near-Realtime full-text search

  • Document-oriented & schema free

  • RESTful API & structured query DSL

  • Elastic Stack
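To make the "RESTful API & structured query DSL" bullet concrete, here is a minimal sketch of a search request body built as plain Python data and serialized to JSON. The field names `content` and `market` are hypothetical (the slides do not show Pandora's actual mapping); the `bool` / `match` / `term` clauses are standard query DSL.

```python
import json


def build_search_body(text, market, page=0, page_size=10):
    """Build a query DSL body: full-text match plus an exact-value filter."""
    return {
        "from": page * page_size,  # offset of the first hit to return
        "size": page_size,         # number of hits per page
        "query": {
            "bool": {
                # "must" clauses are analyzed and scored (full-text search)
                "must": [{"match": {"content": text}}],
                # "filter" clauses match exactly and do not affect scoring
                "filter": [{"term": {"market": market}}],
            }
        },
    }


body = build_search_body("UserLocation", "en-us", page=2)
print(json.dumps(body, indent=2))
```

A body like this would be sent with an HTTP POST to the `_search` endpoint of an index, which is what makes the API RESTful: the whole query is just a JSON document.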

     

Set up ElasticSearch in 1 minute

ELK in Pandora

IIS

Node.js

ElasticSearch

Logstash

SlapiEval Engine

Query Log

[2016-10-19 01:22:49.973] [INFO] query-logger - F******T\y******u, Source:pandora, Params:{"abnormal":"false","autosuggest":"false","dataset":"bing","keyword":"RootDataSource","page":"0","source":"pandora"}, IsNoResult:true, Time:1267
[2016-10-19 01:23:03.431] [INFO] query-logger - F******T\y******u, Source:pandora, Params:{"abnormal":"false","autosuggest":"false","dataset":"bing","keyword":"RootDataSource != null","page":"0","source":"pandora"}, IsNoResult:true, Time:403
[2016-10-19 01:23:10.513] [INFO] query-logger - F******T\y******u, Source:pandora, Params:{"abnormal":"false","autosuggest":"false","dataset":"bing","keyword":"Service","page":"0","source":"pandora"}, IsNoResult:false, Time:868
[2016-10-19 01:23:12.444] [INFO] query-logger - F******T\j******g, Source:pandora, Params:{"abnormal":"false","autosuggest":"false","dataset":["bing","mobile","cortana"],"keyword":"*","page":"0","source":"pandora"}, IsNoResult:false, Time:574
[2016-10-19 01:23:47.332] [INFO] query-logger - F******T\y******u, Source:pandora, Params:{"abnormal":"false","autosuggest":"false","dataset":"bing","keyword":"SLAPI_EVALUATED.RootDataSource != null","page":"0","source":"pandora"}, IsNoResult:true, Time:324
[2016-10-19 01:24:04.499] [INFO] query-logger - F******T\y******u, Agg:Service, IsNoResult:true, Time:181
[2016-10-19 01:24:16.103] [INFO] query-logger - F******T\y******u, Source:pandora, Params:{"abnormal":"false","autosuggest":"false","dataset":"bing","keyword":"UserLocation","page":"0","source":"pandora"}, IsNoResult:false, Time:969
[2016-10-19 01:24:22.531] [INFO] query-logger - F******T\y******u, Source:pandora, Params:{"abnormal":"false","autosuggest":"false","dataset":["bing","mobile","cortana"],"keyword":"UserLocation","page":"0","source":"pandora"}, IsNoResult:false, Time:739
[2016-10-19 01:24:35.689] [INFO] query-logger - F******T\y******u, Source:pandora, Params:{"abnormal":"false","autosuggest":"false","dataset":["bing","mobile","cortana"],"keyword":"UserLocation","page":"4","source":"pandora"}, IsNoResult:false, Time:446
[2016-10-19 01:24:55.560] [INFO] query-logger - F******T\j******g, Source:pandora, Params:{"abnormal":"false","autosuggest":"false","dataset":["bing","mobile","cortana"],"keyword":"*","page":"0","source":"pandora"}, IsNoResult:false, Time:570
[2016-10-19 01:24:55.990] [INFO] query-logger - F******T\m******a, Source:pandora, Params:{"abnormal":"false","autosuggest":"false","dataset":["bing","mobile","cortana"],"keyword":"*","page":"0","source":"pandora"}, IsNoResult:false, Time:682
[2016-10-19 01:25:46.088] [INFO] query-logger - F******T\y******u, Agg:RootDataSource, IsNoResult:true, Time:192
[2016-10-19 01:26:05.836] [INFO] query-logger - F******T\y******u, Source:pandora, Params:{"abnormal":"false","autosuggest":"false","dataset":["bing","mobile","cortana"],"keyword":"UserLocation","page":"0","source":"pandora"}, IsNoResult:false, Time:382
[2016-10-19 01:26:59.910] [INFO] query-logger - F******T\m******a, Source:linqpad, Params:{"keyword":"messi","dataset":"bing","market":"en-us","vertical":"web","pageSize":"10","page":"0","abnormal":"false","autosuggest":"false","source":"linqpad"}, IsNoResult:false, Time:782
[2016-10-19 01:27:57.141] [INFO] query-logger - F******T\j******g, Source:pandora, Params:{"abnormal":"false","autosuggest":"false","dataset":["bing","mobile","cortana"],"keyword":"Page.SmartSearch.AS.Suggestions","page":"0","source":"pandora"}, IsNoResult:false, Time:729
[2016-10-19 01:28:54.076] [INFO] query-logger - F******T\j******g, Source:pandora, Params:{"abnormal":"false","autosuggest":"false","dataset":["bing","mobile","cortana"],"keyword":"Suggestions","page":"0","source":"pandora"}, IsNoResult:false, Time:1716
[2016-10-19 01:29:46.895] [INFO] query-logger - F******T\y******u, Source:pandora, Params:{"abnormal":"false","autosuggest":"false","dataset":["bing","mobile","cortana"],"keyword":"UserLocation","page":"2","source":"pandora"}, IsNoResult:false, Time:423
[2016-10-19 01:48:50.986] [INFO] query-logger - F******T\m******a, Source:pandora, Params:{"abnormal":"false","autosuggest":"false","dataset":["bing","mobile","cortana"],"keyword":"machine","page":"0","source":"pandora"}, IsNoResult:false, Time:1356
[2016-10-19 01:57:10.704] [INFO] query-logger - F******T\j******g, Source:pandora, Params:{"abnormal":"false","autosuggest":"false","dataset":["bing","mobile","cortana"],"keyword":"Suggestions","page":"0","source":"pandora"}, IsNoResult:false, Time:458
[2016-10-19 02:35:44.219] [INFO] query-logger - F******T\j******l, Source:pandora, Params:{"abnormal":"false","autosuggest":"false","dataset":["bing","mobile","cortana"],"keyword":"*","page":"0","source":"pandora"}, IsNoResult:false, Time:556
[2016-10-19 02:41:51.598] [INFO] query-logger - F******T\j******l, Source:pandora, Params:{"abnormal":"false","autosuggest":"false","dataset":["bing","mobile","cortana"],"keyword":"*","page":"0","source":"pandora"}, IsNoResult:false, Time:578
[2016-10-19 02:42:14.646] [INFO] query-logger - F******T\j******l, Source:pandora, Params:{"abnormal":"false","autosuggest":"false","dataset":["bing","mobile","cortana"],"keyword":"*","page":"0","source":"pandora"}, IsNoResult:false, Time:719
[2016-10-19 07:09:52.757] [INFO] query-logger - F******T\j******g, Source:pandora, Params:{"abnormal":"false","autosuggest":"false","dataset":["bing","mobile","cortana"],"keyword":"*","page":"0","source":"pandora"}, IsNoResult:false, Time:593
[2016-10-19 13:09:55.036] [INFO] query-logger - R******D\x******s, Source:pandora, Params:{"abnormal":"false","autosuggest":"false","dataset":["bing","mobile","cortana"],"keyword":"CoreUX_LB_Response_From_ATLA_Chunk2","page":"0","source":"pandora"}, IsNoResult:false, Time:2219
[2016-10-19 21:02:20.166] [INFO] query-logger - F******T\m******a, Agg:Page_Name, IsNoResult:false, Time:281
[2016-10-19 21:02:30.457] [INFO] query-logger - F******T\m******a, Source:pandora, Params:{"abnormal":"false","autosuggest":"false","dataset":["bing","mobile","cortana"],"keyword":"Page_Name:page.noresults","page":"0","source":"pandora","addloginsource":"1","addloginresponse":"1"}, IsNoResult:false, Time:657
[2016-10-19 21:06:37.307] [INFO] query-logger - F******T\m******a, Source:pandora, Params:{"abnormal":"false","autosuggest":"false","dataset":["bing","mobile","cortana"],"keyword":"Page_Name:page.noresults","market":"en-US","page":"0","source":"pandora","addloginsource":"1","addloginresponse":"1"}, IsNoResult:false, Time:640
[2016-10-19 22:27:56.184] [INFO] query-logger - F******T\j******g, Agg:Page_Name, IsNoResult:false, Time:313
[2016-10-20 00:16:23.163] [INFO] query-logger - F******T\v******o, Source:pandora, Params:{"abnormal":"false","autosuggest":"false","dataset":"bing","keyword":"www.sogou.com/websearch/xml","market":"zh-cn","page":"0","source":"pandora","vertical":"web"}, IsNoResult:false, Time:750
[2016-10-20 01:18:01.702] [INFO] query-logger - F******T\j******g, Source:pandora, Params:{"abnormal":"false","autosuggest":"false","dataset":["bing","mobile","cortana"],"keyword":"*","page":"0","source":"pandora"}, IsNoResult:false, Time:656
[2016-10-20 01:21:19.443] [INFO] query-logger - F******T\m******a, Source:pandora, Params:{"abnormal":"false","autosuggest":"false","dataset":["bing","mobile","cortana"],"keyword":"*","market":"en-US","page":"0","source":"pandora"}, IsNoResult:false, Time:638
[2016-10-20 01:21:48.741] [INFO] query-logger - F******T\m******a, Source:pandora, Params:{"abnormal":"false","autosuggest":"false","dataset":["bing","mobile","cortana"],"keyword":"*","market":"en-US","page":"1","source":"pandora"}, IsNoResult:false, Time:469
[2016-10-20 01:22:12.103] [INFO] query-logger - F******T\m******a, Source:pandora, Params:{"abnormal":"false","autosuggest":"false","dataset":["bing","mobile","cortana"],"keyword":"*","market":"en-US","page":"3","source":"pandora"}, IsNoResult:false, Time:985
[2016-10-20 04:21:04.769] [INFO] query-logger - F******T\j******l, Source:pandora, Params:{"abnormal":"false","autosuggest":"false","dataset":["bing","mobile","cortana"],"keyword":"*","page":"0","source":"pandora"}, IsNoResult:false, Time:610
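The log above has two line shapes: search queries (`Source:…, Params:{…}`) and aggregation requests (`Agg:…`). As a rough sketch of that structure, the following parser handles both. The regular expression is written for this example only and is not taken from Pandora; treating `Time` as milliseconds is an assumption.

```python
import json
import re
from datetime import datetime

LINE_RE = re.compile(
    r"^\[(?P<time>[^\]]+)\] \[(?P<level>\w+)\] query-logger - "
    r"(?P<user>[^,]+), "
    r"(?:Source:(?P<source>\w+), Params:(?P<params>\{.*\})|Agg:(?P<agg>\w+)), "
    r"IsNoResult:(?P<no_result>true|false), Time:(?P<ms>\d+)$"
)


def parse_line(line):
    """Parse one query-log line into a dict, or return None if it does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    return {
        "time": datetime.strptime(m.group("time"), "%Y-%m-%d %H:%M:%S.%f"),
        "user": m.group("user"),
        "source": m.group("source"),  # None for Agg lines
        "agg": m.group("agg"),        # None for search lines
        "params": json.loads(m.group("params")) if m.group("params") else None,
        "is_no_result": m.group("no_result") == "true",
        "duration_ms": int(m.group("ms")),  # assumption: Time is in milliseconds
    }


search = parse_line(
    r'[2016-10-19 01:22:49.973] [INFO] query-logger - F******T\y******u, '
    r'Source:pandora, Params:{"abnormal":"false","autosuggest":"false",'
    r'"dataset":"bing","keyword":"RootDataSource","page":"0","source":"pandora"}, '
    r'IsNoResult:true, Time:1267'
)
agg = parse_line(
    r'[2016-10-19 01:24:04.499] [INFO] query-logger - F******T\y******u, '
    r'Agg:Service, IsNoResult:true, Time:181'
)
```

Note that `dataset` is sometimes a string and sometimes a list in the log, so any consumer of `params` has to accept both.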

Logstash Config

input {
    file {
        type => "query"
        path => "D:/Pandora/logs/pandora-query.log"
        start_position => beginning 
        ignore_older => 0
    }
}

filter {
    if [type] == "query" {
        grok {
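            # The sample log carries an optional ", Source:<name>" segment before Params; it is matched here but not captured.
            match => { "message" => "\[%{TIMESTAMP_ISO8601:logtimeString}\] \[%{LOGLEVEL:level}\] query-logger - (%{USERNAME:domain}\\)?%{USERNAME:alias}(, Source:%{WORD})?, Params:%{GREEDYDATA:params}, IsNoResult:%{WORD:isNoResult}, Time:%{NONNEGINT:duration:int}" }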
        }
        json {
            source => "params"
        }
        date {
            match => ["logtimeString", "YYYY-MM-dd HH:mm:ss.SSS"]
            target => "logtime"
        }
    }
}

output {
    stdout {}
    if [type] == "query" {
        elasticsearch {
            hosts => "lsstc451"
            # NOTE: index name must be lower case
            index => "pandora_query_log-%{+YYYY}"
        }
    }
}
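To show what the `json`, `date`, and `elasticsearch` stages of the config above do to an event once grok has extracted `logtimeString` and `params`, here is a simplified Python model. It is a sketch of the behavior, not Logstash code. The function name `to_event` and the `ingest_time` argument (standing in for the event's `@timestamp`) are made up for this illustration.

```python
import json
from datetime import datetime


def to_event(logtime_string, params_json, ingest_time):
    """Mimic the json and date filters, then compute the target index name."""
    event = {"type": "query", "params": params_json}

    # json filter: with no "target" set, parsed keys land at the event's top level
    event.update(json.loads(params_json))

    # date filter: target => "logtime" stores the parsed time in that field
    # and leaves @timestamp untouched
    event["logtime"] = datetime.strptime(logtime_string, "%Y-%m-%d %H:%M:%S.%f")

    # index => "pandora_query_log-%{+YYYY}" is formatted from @timestamp,
    # modeled here by ingest_time, giving one index per year
    index = "pandora_query_log-{:%Y}".format(ingest_time)
    return event, index


event, index = to_event(
    "2016-10-19 01:23:10.513",
    '{"dataset":"bing","keyword":"Service","page":"0","source":"pandora"}',
    ingest_time=datetime(2016, 11, 10, 9, 0, 0),
)
print(index, event["keyword"], event["logtime"])
```

One consequence worth noticing: because the index name follows `@timestamp` rather than `logtime`, old log lines imported later end up in the index for the year they were ingested.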

Kibana

Some Traps & Tips

  • The field type for a given field name MUST be the same across all indices
  • Dots are NOT allowed in field names (before ES 2.4.x)
  • Word Segmentation
  • Breaking changes between different versions
  • Always define the mapping
  • Avoid mapping explosion
  • Cluster structure design (new Ingest node in ES 5.x)    
  • Plan for index growth (time-based data)
  • Parameter tuning for different environment
  • Useful tools:
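As an illustration of the first trap in the list above (one field name, different types in different indices), here is a small sketch that detects such conflicts. It works on a simplified `{index: {field: type}}` dictionary rather than a real mapping response, and the index name `pandora_query_log-2015` and the field types are invented for the example.

```python
from collections import defaultdict


def find_type_conflicts(mappings):
    """Return {field: {type: [indices]}} for fields mapped to more than one type.

    mappings: {index_name: {field_name: field_type}} (simplified, flat mapping).
    """
    seen = defaultdict(lambda: defaultdict(list))
    for index, fields in mappings.items():
        for field, field_type in fields.items():
            seen[field][field_type].append(index)
    # Keep only fields that appear with two or more distinct types
    return {f: dict(types) for f, types in seen.items() if len(types) > 1}


mappings = {
    "pandora_query_log-2015": {"duration": "string", "keyword": "string"},
    "pandora_query_log-2016": {"duration": "integer", "keyword": "string"},
}
conflicts = find_type_conflicts(mappings)
print(conflicts)
```

Here `duration` is reported because dynamic mapping guessed a different type in each index, which is the situation that "always define the mapping" is meant to prevent.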

Web Front-End

  • Deep separation of front-end and back-end
  • Componentization
  • Modularization
  • Data-driven
  • From back-end MVC to front-end MVP / MVVM
  • Testable
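The "data-driven", "MVVM", and "testable" points above can be sketched in a few lines. This is a language-agnostic illustration written in Python; the slides do not show Pandora's front-end code, and every name here (`SearchViewModel`, `render`) is hypothetical.

```python
class SearchViewModel:
    """Holds UI state and notifies subscribers; knows nothing about rendering."""

    def __init__(self):
        self._state = {"keyword": "", "results": []}
        self._listeners = []

    def subscribe(self, listener):
        self._listeners.append(listener)

    def update(self, **changes):
        # Replace the state instead of mutating it, then notify every listener
        self._state = {**self._state, **changes}
        for listener in self._listeners:
            listener(self._state)


def render(state):
    """Pure view function: the same state always yields the same markup."""
    if not state["results"]:
        return "<p>No results for '{}'</p>".format(state["keyword"])
    items = "".join("<li>{}</li>".format(r) for r in state["results"])
    return "<ul>{}</ul>".format(items)


rendered = []
vm = SearchViewModel()
vm.subscribe(lambda state: rendered.append(render(state)))
vm.update(keyword="UserLocation")
vm.update(results=["doc-1", "doc-2"])
print(rendered)
```

Because the view is a pure function of the state, it can be tested without a browser, and the view model can be tested without any view attached.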

Beyond Browser

Develop once, deploy everywhere

Nginx

RESTful JSON API

Pandora as a Service

Pandora

LINQPad Driver

Pandora Website
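Since both the website and the LINQPad driver go through the same RESTful JSON API, any client only needs to build an HTTP request. The sketch below assembles a search URL from the parameter names that appear in the query log (`keyword`, `dataset`, `market`, `page`, `source`). The base URL is hypothetical, and encoding a multi-valued `dataset` as a repeated key is an assumption, since the slides do not show the actual endpoint.

```python
from urllib.parse import urlencode

BASE_URL = "http://pandora.example/api/search"  # hypothetical endpoint


def build_search_url(keyword, datasets, page=0, source="linqpad", **extra):
    """Build a search URL; extra keyword arguments become extra query parameters."""
    params = {
        "keyword": keyword,
        "dataset": datasets,
        "page": page,
        "source": source,
        **extra,
    }
    # doseq=True repeats the key for list values: dataset=bing&dataset=mobile
    return BASE_URL + "?" + urlencode(params, doseq=True)


url = build_search_url("messi", ["bing", "mobile"], market="en-us")
print(url)
```

The `source` parameter is what lets the server-side query log distinguish website traffic from driver traffic.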

Thanks

Introduction of technologies used in Pandora v1.0