Koha SQL training

Alex Buckley

Feb 2024

Agenda

  • Introductions
  • Writing SQL reports
    • Resources for writing SQL reports
    • SQL essentials
  • SQL that sits behind the Statistics wizard
  • Linking tables/creating SQL reports that use more than one Koha module
  • SQL reports for your library
  • Q & A

Introductions

  • Who you are
  • How long you have been using Koha
  • What you hope to get out of this workshop

Alex Buckley

  • Rōpū kohinga developer at Catalyst since 2015
  • Based in Nelson
  • Data migrations, new Koha and VuFind implementations, integrations, training

Test sites for today's training

  • User:staff
  • Password:staff1

(If you don't have a test site to use)

Writing SQL reports

Resources for writing SQL reports

Koha reports wiki

Mana Knowledge Base

Watch the video tutorial

Read the blog post

Resources for writing SQL reports

Koha Database Schema

  • Shows database tables and columns
  • Shows relationships between tables

 

  • Every table (set of data) has columns (attributes)
  • Each row in a table represents a record
  • Sometimes tables are linked by a shared column (key)

Writing SQL reports

Important

  • Always run your report on your test site before trying it in production.
  • Do not query the action_logs table, use the Log Viewer Tool instead.
  • Do not refresh the page while running a report, and do not press Run more than once.
  • Reports that are too vague may time out or overload the server.
  • The Koha database structure may change between versions. Old reports may not work in recent Koha versions, and vice versa.
  • You can only view (not modify) data using Koha reports.

Useful tables

  • borrowers, deletedborrowers - patron/member data
  • items, deleteditems - holdings data
  • biblio, deletedbiblio - bibliographic data
  • auth_header - authority data
  • reserves, old_reserves - reserves/holds data
  • issues, old_issues - circulation data
  • branchtransfers - transfer data
  • accountlines - fines and fees data
  • statistics
  • serial - specific serial issue data
  • subscription

SQL essentials

SELECT query

SELECT

SELECT table.column
FROM table
SELECT count(*) 
FROM table
SELECT table.column1, table.column2, table.column3
FROM table
SELECT table.* 
FROM table

SELECT

SELECT biblio.biblionumber
FROM biblio
SELECT count(*) 
FROM items
SELECT borrowers.firstname, borrowers.surname, borrowers.email
FROM borrowers

WHERE

SELECT count(*) 
FROM table 
WHERE table.column = "value"
SELECT table.column1, table.column 2
FROM table
WHERE table.column IS NULL
SELECT table.column1
FROM table
WHERE table.column != "value"

WHERE

SELECT count(*) 
FROM statistics 
WHERE statistics.type = "issue" LIMIT 10;
SELECT count(*) 
FROM items
WHERE items.barcode IS NULL
SELECT count(*) 
FROM reserves
WHERE reserves.itemnumber IS NOT NULL

AS

SELECT table.column AS "Column name" FROM table

SELECT table.column AS ColumnName FROM table
SELECT borrowers.firstname AS "First name" FROM borrowers

SELECT borrowers.surname AS Surname FROM borrowers

LIKE with wildcards

SELECT table.column FROM table
WHERE table.column LIKE "%middle%";
SELECT table.column FROM table
WHERE table.column NOT LIKE "%value%";
SELECT table.column FROM table
WHERE table.column LIKE "%end";
SELECT table.column FROM table
WHERE table.column LIKE "start%";

LIKE with wildcards

SELECT borrowers.surname AS Surname, 
borrowers.firstname AS "First Name", 
borrowers.cardnumber AS "Card Number", 
borrowers.email AS Email
FROM borrowers
WHERE borrowers.email NOT LIKE '%@%.%'
SELECT biblio.title FROM biblio
WHERE biblio.title LIKE "%dog%";

DISTINCT

SELECT DISTINCT(table.column) 
FROM table

DISTINCT

SELECT DISTINCT(items.itype) 
FROM items

WHERE with AND

SELECT table.column1, table.column2
FROM table
WHERE table.column = 'data'
AND table.column != 'data'
AND table.column >= 'number'

WHERE with AND

SELECT items.itemnumber, items.barcode 
FROM items 
WHERE items.homebranch = 'CPL'
AND items.ccode = 'FIC'
SELECT items.* FROM items
WHERE items.ccode = 'FIC'
AND items.dateaccessioned > '2021';

WHERE with OR

SELECT table.column1, table.column2 
FROM table
WHERE table.column = "data" 
OR table.column IS NOT NULL

WHERE with OR

SELECT items.biblionumber, items.itemnumber 
FROM items
WHERE items.itemlost = 1
OR items.damaged = 1
OR items.withdrawn = 1
OR items.notforloan = 1

RUNTIME PARAMETERS

SELECT table.column
FROM table
WHERE table.column = <<Label for text input>>
SELECT table.column
FROM table
WHERE table.column = <<Label for dropdown|dropdown>>

RUNTIME PARAMETERS

RUNTIME PARAMETERS

SELECT biblio.biblionumber, biblio.title
FROM biblio 
WHERE biblio.frameworkcode = <<Framework code|biblio_framework>>
SELECT items.itemnumber, items.barcode
FROM items
WHERE items.ccode = <<Collection|CCODE>>

ORDER BY

SELECT table.column
FROM table
WHERE table.column = "data"
ORDER BY table.column ASC
SELECT table.column
FROM table
WHERE table.column = "data"
ORDER BY table.column DESC

ORDER BY

SELECT borrowers.surname, borrowers.firstname 
FROM borrowers
WHERE borrowers.branchcode = "CPL"
ORDER BY borrowers.surname ASC
SELECT borrowers.surname, borrowers.firstname 
FROM borrowers
WHERE borrowers.branchcode = "CPL"
ORDER BY borrowers.firstname DESC

LIMIT

SELECT table.column
FROM table
LIMIT num_of_rows

LIMIT

SELECT biblio.*
FROM biblio 
WHERE biblio.datecreated > '2020'
LIMIT 5;
SELECT biblio.*
FROM biblio 
WHERE biblio.datecreated > '2020'
ORDER BY biblio.title ASC
LIMIT 5;

GROUP BY

SELECT COUNT(table.column), table.column 
FROM table 
GROUP BY table.column
SELECT SUM(table.column), table.column 
FROM table 
GROUP BY table.column

GROUP BY

SELECT COUNT(items.itemcallnumber) AS 'Number of Items', items.itemcallnumber 
FROM items 
GROUP BY items.itemcallnumber 

HAVING

HAVING - filter results after grouping

SELECT COUNT(table.column1), table.column2
FROM table
GROUP BY table.column2
HAVING COUNT(table.column1) > 'num'

HAVING

SELECT GROUP_CONCAT(biblioitems.biblionumber SEPARATOR ', ') AS 'Duplicate biblionumbers', biblioitems.isbn 
FROM biblioitems 
GROUP BY biblioitems.isbn 
HAVING COUNT(biblioitems.isbn) > 5

Return all biblionumbers where there are more than 5 duplicates

MARC

SELECT 

ExtractValue(table.metadata,'//datafield[@tag="tag"]/*'),

ExtractValue(table.metadata,'//datafield[@tag="tag"]/subfield[@code="subfield"]'),

ExtractValue(table.metadata,'//datafield[@tag="tag"]/subfield[@code>="subfield"]')

FROM table

Table will always be biblio_metadata or auth_header

MARC

SELECT ExtractValue(biblio_metadata.metadata,
'//datafield[@tag="245"]/*') AS 'Full Title information' 
FROM biblio_metadata 
WHERE biblio_metadata.biblionumber=14
SELECT ExtractValue(biblio_metadata.metadata,
'//datafield[@tag="245"]/subfield[@code="a"]') AS 'Title' 
FROM biblio_metadata
WHERE biblio_metadata.biblionumber=14

Linking tables

Creating SQL reports that use more than one Koha module

JOIN

SELECT a.column, b.column
FROM table a
JOIN table b USING (sameColumnName)
WHERE a.column = "value"
SELECT a.column, b.column
FROM table a
JOIN table b ON (a.column = b.column)
WHERE c.column = "value"

JOIN

SELECT b.biblionumber, b.title, i.barcode 
FROM biblio b
JOIN items i USING (biblionumber)
WHERE b.frameworkcode=<<Framework code>>
SELECT DISTINCT(b.title), i.itype, bi.itemtype 
FROM items i 
JOIN biblio b USING (biblionumber) 
JOIN biblioitems bi ON (i.itype=bi.itemtype)

Short break

SQL that sits behind the Statistics Wizards reports

Members wizard

Three SQL queries are run:

  • Query 1: Populate row values
  • Query 2: Populate column values
  • Query 3: Populate cells with calculated row/column combinations
    • This query includes filter values in a WHERE clause

 

 

Members wizard

Query 1

SELECT
DISTINCT borrowers.branchcode
FROM borrowers
WHERE borrowers.branchcode IS NOT NULL

ORDER BY borrowers.branchcode

Fetch distinct (unique) values for rows

Query 2

SELECT DISTINCT borrowers.categorycode
FROM borrowers
WHERE borrowers.categorycode IS NOT NULL
ORDER BY borrowers.categorycode

Fetch distinct values for columns

Query 3

SELECT borrowers.branchcode, borrowers.categorycode, COUNT(*)
FROM borrowers  

WHERE borrowers.dateofbirth > '1990-01-01'
AND borrowers.dateofbirth < '2023-01-31'
GROUP BY borrowers.branchcode, borrowers.categorycode

Count by row/column combinations - include filters in a WHERE clause

Results

Catalogue wizard

Three SQL queries are run:

  • Query 1: Populate row values
  • Query 2: Populate column values
  • Query 3: Populate cells with calculated row/column combinations
    • Type of calculation can be configured from the 'Cell value' checkbox
    • This query includes filter values in a WHERE clause

 

 

Catalogue wizard

Query 1

SELECT DISTINCTROW items.homebranch
FROM items
LEFT JOIN biblioitems USING (biblioitemnumber
)

LEFT JOIN biblio ON (biblioitems.biblionumber = biblio.biblionumber)

WHERE 1
ORDER BY items.homebranch

Query 2

SELECT DISTINCTROW items.itype
FROM items
LEFT JOIN biblioitems
USING (biblioitemnumber)

LEFT JOIN biblio
ON (biblioitems.biblionumber = biblio.biblionumber)
WHERE 1  AND items.itype LIKE 'BK'  
ORDER BY items.itype

Query 3

SELECT items.homebranch, items.itype,  COUNT(*)

FROM items
LEFT JOIN biblioitems ON (items.biblioitemnumber = biblioitems.biblioitemnumber)

LEFT JOIN biblio ON (biblioitems.biblionumber = biblio.biblionumber)
WHERE 1  AND items.itype LIKE 'BK'

GROUP BY items.homebranch, items.itype
ORDER BY items.homebranch,items.itype

A change the cell value checkbox

Changes the COUNT() in query 3

SELECT items.homebranch, items.itype,  COUNT(DISTINCT biblioitems.biblionumber)

FROM items
LEFT JOIN biblioitems ON (items.biblioitemnumber = biblioitems.biblioitemnumber)
LEFT JOIN biblio ON (biblioitems.biblionumber = biblio.biblionumber)
WHERE 1  AND items.itype LIKE ?

GROUP BY items.homebranch, items.itype
ORDER BY items.homebranch, items.itype

Creating SQL reports for your library

Creating SQL reports for your library

  • Items without an itype (item type), home library or holding library set
  • New titles this week
  • Missing items
  • Items in transit

Try these out:

Items without an itype (item type), home library or holding library set

SELECT items.biblionumber, items.itemnumber, items.itype, items.homebranch, items.holdingbranch
FROM items
WHERE items.itype IS NULL
OR items.itype = ''
OR items.homebranch IS NULL
OR items.homebranch = ''
OR items.holdingbranch IS NULL
OR items.holdingbranch = '';

New titles this week

SELECT items.itemnumber, items.dateaccessioned
FROM items
WHERE items.dateaccessioned
BETWEEN DATE_SUB( CURDATE(), INTERVAL 7 DAY ) AND CURDATE();
SELECT biblio.biblionumber, biblio.datecreated
FROM biblio
WHERE biblio.datecreated
BETWEEN DATE_SUB( CURDATE(), INTERVAL 7 DAY ) AND CURDATE();

Missing items

SELECT items.itemnumber, items.datelastseen, items.itemlost_on
FROM items
WHERE items.itemlost IS NOT NULL;

Items in transit

SELECT *
FROM branchtransfers
WHERE branchtransfers.datesent IS NOT NULL
AND branchtransfers.datearrived IS NULL
AND branchtransfers.datecancelled IS NULL;
SELECT t.itemnumber, frb.branchname, tob.branchname
FROM branchtransfers t
LEFT JOIN branches frb ON (t.frombranch = frb.branchcode)
LEFT JOIN branches tob ON (t.tobranch = tob.branchcode)
WHERE t.datesent IS NOT NULL AND t.datearrived IS NULL AND t.datecancelled IS NULL;

Questions?

Thank you for coming!

Made with Slides.com