Koha SQL training
Alex Buckley
Feb 2024
Agenda
- Introductions
- Writing SQL reports
- Resources for writing SQL reports
- SQL essentials
- SQL that sits behind the Statistics wizard
- Linking tables/creating SQL reports that use more than one Koha module
- SQL reports for your library
- Q & A
Introductions
- Who you are
- How long you have been using Koha
- What you hope to get out of this workshop
Alex Buckley
- Rōpū kohinga developer at Catalyst since 2015
- Based in Nelson
- Data migrations, new Koha and VuFind implementations, integrations, training
Test sites for today's training
- User:staff
- Password:staff1
(If you don't have a test site to use)
Writing SQL reports
Resources for writing SQL reports
Koha reports wiki
Resources for writing SQL reports
Koha Database Schema
- Shows database tables and columns
- Shows relationships between tables
- Every table (set of data) has columns (attributes)
- Each row in a table represents a record
- Sometimes tables are linked by a shared column (key)
Writing SQL reports
Important
- Always run your report on your test site before trying it in production.
- Do not query the action_logs table, use the Log Viewer Tool instead.
- Do not refresh the page while running a report, and do not press Run more than once.
- Reports that are too vague may time out or overload the server.
- The Koha database structure may change between versions. Old reports may not work in recent Koha versions, and vice versa.
- You can only view (not modify) data using Koha reports.
Useful tables
- borrowers, deletedborrowers - patron/member data
- items, deleteditems - holdings data
- biblio, deletedbiblio - bibliographic data
- auth_header - authority data
- reserves, old_reserves - reserves/holds data
- issues, old_issues - circulation data
- branchtransfers - transfer data
- accountlines - fines and fees data
- statistics
- serial - specific serial issue data
- subscription
SQL essentials
SELECT query
SELECT
SELECT table.column FROM table
SELECT count(*) FROM table
SELECT table.column1, table.column2, table.column3 FROM table
SELECT table.* FROM table
SELECT
SELECT biblio.biblionumber FROM biblio
SELECT count(*) FROM items
SELECT borrowers.firstname, borrowers.surname, borrowers.email FROM borrowers
WHERE
SELECT count(*) FROM table WHERE table.column = "value"
SELECT table.column1, table.column 2 FROM table WHERE table.column IS NULL
SELECT table.column1 FROM table WHERE table.column != "value"
WHERE
SELECT count(*) FROM statistics WHERE statistics.type = "issue" LIMIT 10;
SELECT count(*) FROM items WHERE items.barcode IS NULL
SELECT count(*) FROM reserves WHERE reserves.itemnumber IS NOT NULL
AS
SELECT table.column AS "Column name" FROM table SELECT table.column AS ColumnName FROM table
SELECT borrowers.firstname AS "First name" FROM borrowers SELECT borrowers.surname AS Surname FROM borrowers
LIKE with wildcards
SELECT table.column FROM table WHERE table.column LIKE "%middle%";
SELECT table.column FROM table WHERE table.column NOT LIKE "%value%";
SELECT table.column FROM table WHERE table.column LIKE "%end";
SELECT table.column FROM table WHERE table.column LIKE "start%";
LIKE with wildcards
SELECT borrowers.surname AS Surname, borrowers.firstname AS "First Name", borrowers.cardnumber AS "Card Number", borrowers.email AS Email FROM borrowers WHERE borrowers.email NOT LIKE '%@%.%'
SELECT biblio.title FROM biblio WHERE biblio.title LIKE "%dog%";
DISTINCT
SELECT DISTINCT(table.column) FROM table
DISTINCT
SELECT DISTINCT(items.itype) FROM items
WHERE with AND
SELECT table.column1, table.column2 FROM table WHERE table.column = 'data' AND table.column != 'data' AND table.column >= 'number'
WHERE with AND
SELECT items.itemnumber, items.barcode FROM items WHERE items.homebranch = 'CPL' AND items.ccode = 'FIC'
SELECT items.* FROM items WHERE items.ccode = 'FIC' AND items.dateaccessioned > '2021';
WHERE with OR
SELECT table.column1, table.column2 FROM table WHERE table.column = "data" OR table.column IS NOT NULL
WHERE with OR
SELECT items.biblionumber, items.itemnumber FROM items WHERE items.itemlost = 1 OR items.damaged = 1 OR items.withdrawn = 1 OR items.notforloan = 1
RUNTIME PARAMETERS
SELECT table.column FROM table WHERE table.column = <<Label for text input>>
SELECT table.column FROM table WHERE table.column = <<Label for dropdown|dropdown>>
RUNTIME PARAMETERS
RUNTIME PARAMETERS
SELECT biblio.biblionumber, biblio.title FROM biblio WHERE biblio.frameworkcode = <<Framework code|biblio_framework>>
SELECT items.itemnumber, items.barcode FROM items WHERE items.ccode = <<Collection|CCODE>>
ORDER BY
SELECT table.column FROM table WHERE table.column = "data" ORDER BY table.column ASC
SELECT table.column FROM table WHERE table.column = "data" ORDER BY table.column DESC
ORDER BY
SELECT borrowers.surname, borrowers.firstname FROM borrowers WHERE borrowers.branchcode = "CPL" ORDER BY borrowers.surname ASC
SELECT borrowers.surname, borrowers.firstname FROM borrowers WHERE borrowers.branchcode = "CPL" ORDER BY borrowers.firstname DESC
LIMIT
SELECT table.column FROM table LIMIT num_of_rows
LIMIT
SELECT biblio.* FROM biblio WHERE biblio.datecreated > '2020' LIMIT 5;
SELECT biblio.* FROM biblio WHERE biblio.datecreated > '2020' ORDER BY biblio.title ASC LIMIT 5;
GROUP BY
SELECT COUNT(table.column), table.column FROM table GROUP BY table.column
SELECT SUM(table.column), table.column FROM table GROUP BY table.column
GROUP BY
SELECT COUNT(items.itemcallnumber) AS 'Number of Items', items.itemcallnumber FROM items GROUP BY items.itemcallnumber
HAVING
HAVING - filter results after grouping
SELECT COUNT(table.column1), table.column2 FROM table GROUP BY table.column2 HAVING COUNT(table.column1) > 'num'
HAVING
SELECT GROUP_CONCAT(biblioitems.biblionumber SEPARATOR ', ') AS 'Duplicate biblionumbers', biblioitems.isbn FROM biblioitems GROUP BY biblioitems.isbn HAVING COUNT(biblioitems.isbn) > 5
Return all biblionumbers where there are more than 5 duplicates
MARC
SELECT ExtractValue(table.metadata,'//datafield[@tag="tag"]/*'), ExtractValue(table.metadata,'//datafield[@tag="tag"]/subfield[@code="subfield"]'), ExtractValue(table.metadata,'//datafield[@tag="tag"]/subfield[@code>="subfield"]') FROM table
Table will always be biblio_metadata or auth_header
MARC
SELECT ExtractValue(biblio_metadata.metadata, '//datafield[@tag="245"]/*') AS 'Full Title information' FROM biblio_metadata WHERE biblio_metadata.biblionumber=14
SELECT ExtractValue(biblio_metadata.metadata, '//datafield[@tag="245"]/subfield[@code="a"]') AS 'Title' FROM biblio_metadata WHERE biblio_metadata.biblionumber=14
Linking tables
Creating SQL reports that use more than one Koha module
JOIN
SELECT a.column, b.column FROM table a JOIN table b USING (sameColumnName) WHERE a.column = "value"
SELECT a.column, b.column FROM table a JOIN table b ON (a.column = b.column) WHERE c.column = "value"
JOIN
SELECT b.biblionumber, b.title, i.barcode FROM biblio b JOIN items i USING (biblionumber) WHERE b.frameworkcode=<<Framework code>>
SELECT DISTINCT(b.title), i.itype, bi.itemtype FROM items i JOIN biblio b USING (biblionumber) JOIN biblioitems bi ON (i.itype=bi.itemtype)
Short break
SQL that sits behind the Statistics Wizards reports
Members wizard
Three SQL queries are run:
- Query 1: Populate row values
- Query 2: Populate column values
- Query 3: Populate cells with calculated row/column combinations
- This query includes filter values in a WHERE clause
Members wizard
Query 1
SELECT
DISTINCT borrowers.branchcode
FROM borrowers
WHERE borrowers.branchcode IS NOT NULL
ORDER BY borrowers.branchcode
Fetch distinct (unique) values for rows
Query 2
SELECT DISTINCT borrowers.categorycode
FROM borrowers
WHERE borrowers.categorycode IS NOT NULL
ORDER BY borrowers.categorycode
Fetch distinct values for columns
Query 3
SELECT borrowers.branchcode, borrowers.categorycode, COUNT(*)
FROM borrowers
WHERE borrowers.dateofbirth > '1990-01-01'
AND borrowers.dateofbirth < '2023-01-31'
GROUP BY borrowers.branchcode, borrowers.categorycode
Count by row/column combinations - include filters in a WHERE clause
Results
Catalogue wizard
Three SQL queries are run:
- Query 1: Populate row values
- Query 2: Populate column values
- Query 3: Populate cells with calculated row/column combinations
- Type of calculation can be configured from the 'Cell value' checkbox
- This query includes filter values in a WHERE clause
Catalogue wizard
Query 1
SELECT DISTINCTROW items.homebranch
FROM items
LEFT JOIN biblioitems USING (biblioitemnumber)
LEFT JOIN biblio ON (biblioitems.biblionumber = biblio.biblionumber)
WHERE 1
ORDER BY items.homebranch
Query 2
SELECT DISTINCTROW items.itype
FROM items
LEFT JOIN biblioitems
USING (biblioitemnumber)
LEFT JOIN biblio
ON (biblioitems.biblionumber = biblio.biblionumber)
WHERE 1 AND items.itype LIKE 'BK'
ORDER BY items.itype
Query 3
SELECT items.homebranch, items.itype, COUNT(*)
FROM items
LEFT JOIN biblioitems ON (items.biblioitemnumber = biblioitems.biblioitemnumber)
LEFT JOIN biblio ON (biblioitems.biblionumber = biblio.biblionumber)
WHERE 1 AND items.itype LIKE 'BK'
GROUP BY items.homebranch, items.itype
ORDER BY items.homebranch,items.itype
A change the cell value checkbox
Changes the COUNT() in query 3
SELECT items.homebranch, items.itype, COUNT(DISTINCT biblioitems.biblionumber)
FROM items
LEFT JOIN biblioitems ON (items.biblioitemnumber = biblioitems.biblioitemnumber)
LEFT JOIN biblio ON (biblioitems.biblionumber = biblio.biblionumber)
WHERE 1 AND items.itype LIKE ?
GROUP BY items.homebranch, items.itype
ORDER BY items.homebranch, items.itype
Creating SQL reports for your library
Creating SQL reports for your library
- Items without an itype (item type), home library or holding library set
- New titles this week
- Missing items
- Items in transit
Try these out:
Items without an itype (item type), home library or holding library set
SELECT items.biblionumber, items.itemnumber, items.itype, items.homebranch, items.holdingbranch FROM items WHERE items.itype IS NULL OR items.itype = '' OR items.homebranch IS NULL OR items.homebranch = '' OR items.holdingbranch IS NULL OR items.holdingbranch = '';
New titles this week
SELECT items.itemnumber, items.dateaccessioned FROM items WHERE items.dateaccessioned BETWEEN DATE_SUB( CURDATE(), INTERVAL 7 DAY ) AND CURDATE();
SELECT biblio.biblionumber, biblio.datecreated FROM biblio WHERE biblio.datecreated BETWEEN DATE_SUB( CURDATE(), INTERVAL 7 DAY ) AND CURDATE();
Missing items
SELECT items.itemnumber, items.datelastseen, items.itemlost_on FROM items WHERE items.itemlost IS NOT NULL;
Items in transit
SELECT * FROM branchtransfers WHERE branchtransfers.datesent IS NOT NULL AND branchtransfers.datearrived IS NULL AND branchtransfers.datecancelled IS NULL;
SELECT t.itemnumber, frb.branchname, tob.branchname FROM branchtransfers t LEFT JOIN branches frb ON (t.frombranch = frb.branchcode) LEFT JOIN branches tob ON (t.tobranch = tob.branchcode) WHERE t.datesent IS NOT NULL AND t.datearrived IS NULL AND t.datecancelled IS NULL;
Questions?
Thank you for coming!
Koha SQL training - Feb 2024
By alexbuckley
Koha SQL training - Feb 2024
- 205