Databases

options

  1. Percona
  2. MariaDB
  3. PostgreSQL

Percona

www.percona.com

Percona Server is an enhanced drop-in replacement for MySQL

  • Your queries will run faster and more consistently
  • You will consolidate servers on powerful hardware
  • You will delay sharding, or avoid it entirely
  • You will save money on hosting fees and power
  • You will spend less time tuning and administering
  • You will achieve higher uptime
  • You will troubleshoot without guesswork

Percona

  • True drop in replacement
  • Current stable version based on MySQL 5.6
  • Really lacking in documentation (IMO)
  • XtraDB
  • Enhanced performance_schema

MariaDB

www.mariadb.com

MariaDB 10 is the nexus for open source database innovation in the Cloud age. It delivers greater agility, cost-effective scale and high performance to developers

MariaDB

  • Drop in replacement
  • Current version is based on MySQL 5.5
  • Includes back ported patches,
  • Lots of storage engines
  • Including XtraDB

storage engines

  • Aria (MyISAM)
  • XtraDB (INNODB)
  • FederatedX (Federated)
  • SphinxSE (Text search)
  • TokuDB
  • Cassandra
  • CONNECT
  • SEQUENCE
  • Spider

postgresql

www.postgresql.org

PostgreSQL is a powerful, open source object-relational database system. It has more than 15 years of active development and a proven architecture that has earned it a strong reputation for reliability, data integrity, and correctness

PostgreSQL

  • Vague feeling of superiority
  • Tales of woe trying to migrate from MySQL
  • If we were starting again ... ?

problems

  • How to benchmark (without taking weeks)?
  • Not on desktop VM
  • Spindles/memory is a killer - sb4 resolves that

sb4 buffer pool


LOG waits


Write load


Requests


conclusions

Our main wins are going to be through:
  • Schema changes
  • Query changes
  • Configuration changes

Percona/MariaDB can give us small enhancements for 'free'
  • XtraDB drop in worthwhile trying
  • Use of sampling to monitor queries

performance_schema enhancements nice... but a lot of work

Performance schema


small benefits

Slow log
# Time: 140729 16:24:41
# User@Host: lokku[lokku] @ localhost []  Id: 4975130
# Schema: br_realestate  Last_errno: 0  Killed: 0
# Query_time: 4.251948  Lock_time: 0.000119  Rows_sent: 1220746  Rows_examined: 4717181  Rows_affected: 0
# Bytes_sent: 4882675

small benefits

slow log sampling
http://www.percona.com/doc/percona-server/5.6/diagnostics/slow_extended.html

  • Just collect certain types of queries (full_scans, filesorts, etc)
  • Log every X query to find general usage patterns
  • Turns out we do a lot of 'pings'

example queries

# Query_time: 1.028571  Lock_time: 0.000060  Rows_sent: 400  Rows_examined: 1022446  Rows_affected: 0
# Bytes_sent: 1659

/* get_batch_of_deduper_jobs */
        SELECT SQL_CALC_FOUND_ROWS
            id
        FROM
            listings
        WHERE
            is_queued_for_geocoding=0 AND
            is_queued_for_nlp=0 AND
            is_queued_for_deduping=1
        ORDER BY
            int_time_etl_inserted
        DESC
        LIMIT
            400; 

example queries

# Query_time: 1.849355  Lock_time: 0.006700  Rows_sent: 9998  Rows_examined: 39992  Rows_affected: 0
# Bytes_sent: 8128902

/* get_full_listings_by_id */
        SELECT
             *, listings.id as id, listings.title as title, listers.id as lister_id, listers.name as lister_name, listers.company as lister_company, datasources.id as datasource_id, datasources.name as datasource_
name , datasources.title as datasource_title, datasources.provider as datasource_provider FROM listings, listers, realestate_attributes, datasources
        WHERE
            listers.id = listings.lister_id   AND realestate_attributes.id = listings.id  AND datasources.id = listings.datasource_id
        AND
            listings.id in (148469654,145139743,[...])










asdf
a
faf
asdf 






top (slow) requests

   10282 /* get_full_listings_by_id */   2106 /* get_batch_of_deduper_jobs */
   1645 /* is_lid_duplicated_with_other_datasource 
   1311 /* insert_mutex_listings */ 
   1248 /* get_eligible_listings_in_box_and_price
   1118 /* lis_get_fingerprint_data_for_hash */
    968 /* set_nlp_result */ 
    761 /* set_geocoder_result */
    746 /* lis_gain_locks */
    540 /* get_mutex_duplicates_per_listing_id *
    369 /* _lis_sm_pushing */
    341 /* lis_get_random_image_url */
    322 /* get_pre_import_data_for_existing_list
    307 /* _change_listing */ 
    289 /* get_cache_data */ 
    281 /* get_batch_of_geocoding_jobs */
    276 /* lis_photo_urls_sth */
    250 /* _lis_confirm_and_get_job */
    221 /* insert_daemon_metrics */  
    212 /* lis_live_images_sth */

Databases

By chris_lokku

Databases

  • 463