by Evgenii Nikitin
Operational (OLTP) - operations accounting. Uses row-store tables.
Analytical (OLAP) - data analysis. Uses column-store tables.
Row-store (OLTP) | Column-store (OLAP) |
---|---|
Data is stored on the disk but then loaded to the RAM where it is read from. | Data is stored on the disk. Data is read from the disk. |
Has longer recovery times | Data is ready right after DB start |
Is optimized to updated frequently and at random. | Is optimized for batch UPDATE and DELETE queries. |
Searches fast arbitrary data. | Works effectively with reading of big amount of consistent data. |
Supports few indexes. | Supports one index per table only. |
Data is stored as-is. More space is needed for storage then source data. | Data is compressed. Less amount of storage is needed than source data especially with pre-sorting. |
By default rowstore tables are created. | Column tables are simply defined by key CLUSTERED COLUMNSTORE. |
1 UNIT - 8 vCPU and 32GB RAM
MemSQL - commercial database
4 UNITS - FOR FREE!
Environment: 6 vCPU, 16 GB RAM with SSD hard drive, Ubuntu 18.04.
Databases: MySQL 5.7.29 InnoDB, MemSQL 7.0.
Tables: MySQL rowstore, MemSQL rowstore, MemSQL columnstore.
Tests: loading, reading, updating, removing.
Measure: milliseconds.
Import of 11 million lines from 1.5 GB csv file
Datasets: 11 000 000, 1 000 000, 50 000, 1 000 rows.
Number of run: 3
Cases: Rowstore tables with primary key, with indexes and without indexes, Columnstore with primary key only.
Search by primary key.
SELECT * FROM yellow_tripdata_staging WHERE id = 50
Search by parameter.
SELECT * FROM yellow_tripdata_staging WHERE payment_type = 2
Search by range
SELECT * FROM yellow_tripdata_staging WHERE fare_amount >= 10 AND fare_amount <= 20
Calculation of average value
SELECT avg(fare_amount) FROM yellow_tripdata_staging WHERE payment_type = 2
Calculation of amount of items
SELECT count(*) FROM yellow_tripdata_staging WHERE payment_type = 2
Calculation of amount of items with grouping
SELECT COUNT(tip_amount), payment_type FROM yellow_tripdata_staging GROUP BY payment_type;
Search of the string
SELECT * FROM yellow_tripdata_staging WHERE store_and_fwd_flag LIKE 'Y%'
Using a small limit
SELECT payment_type, pickup_longitude, pickup_latitude FROM yellow_tripdata_staging WHERE payment_type > 3 LIMIT 100
Using a big limit
SELECT payment_type, pickup_longitude, pickup_latitude FROM yellow_tripdata_staging WHERE payment_type > 3 LIMIT 100000
Two inner joins
SELECT pickup_longitude, pickup_latitude, r.name, p.name FROM yellow_tripdata_staging as t INNER JOIN rate_code as r ON t.rate_code_id = r.rate_code_id
INNER JOIN payment_type as p ON t.payment_type = p.payment_type
UPDATE yellow_tripdata_staging SET tip_amount = 0, tolls_amount = 0 WHERE id = %d
(10 000 rows have been updated)
UPDATE yellow_tripdata_staging SET tip_amount = 0, tolls_amount = 0
WHERE id >= 1 AND id <= 10000
Insert 10 000 rows - one by one
Insert 10 000 rows - 10 times by 1000 rows
Performance of MemSQL and MySQL highly depends on the size of tables and usage of primary key in requests.
Scope of application of these databases:
MemSQL (cold cache) : 1.5 - 2.6 s
MySQL (cold cache) : 2.3 - 2.7 s
MemSQL (hot cache) : 0.25 - 0.5 s
MySQL (hot cache) : 0.2 - 0.4 s
Real-time analytics platforms
Dashboards
Search system by parameters (hotels, planes)
Systems with frequently updated data
System requirements:
4 CPU x86_64, 8 GB RAM
RHEL/CentOS 6, Debian 8+ (Ubuntu) or Docker
TCP ports:
3306 - is used by aggregator.
3307 - is used by leaves.
8080 - default port for MemSQL Studio
Installation guide:
https://docs.memsql.com/v7.0/guides/deploy-memsql/memsql-tools/deploy-cluster/step-1/
MemSQL is protocol compatible with MySQL.
Database management application:
Any MySQL compatible (Sequel Pro)
PHP Drivers:
PDO_MySQL (requirement for Drupal)
MySQLi
composer require drupal/memsql
"post-update-cmd": [
"cp -r web/modules/contrib/memsql/drivers web/"
]
Installation:
Add to composer.json to update driver folder:
Copy driver to docroot:
cp -r web/modules/contrib/memsql/drivers web/
$databases['default']['default'] = [
'database' => 'drupal',
'username' => 'root',
'password' => '',
'prefix' => '',
'host' => '127.0.0.1',
'port' => '3306',
'namespace' => 'Drupal\\Driver\\Database\\memsql',
'driver' => 'memsql',
];
settings.php
Drupal team leader at Smile Open Source Solutions