PhD Qualifying Exam
Saeid Balaneshinkordan
advisor:
Dr. Alexander Kotov
March 2016
Diagnosed with blood cancer per hour in the US
Deaths from blood cancer per hour in the US
definition of CDS systems
benefits of CDS systems
One of the tasks a CDS system can be designed for:
Overcoming the abundance of medical information
Query 1
Query Type: Diagnosis
Query Description: A 26-year-old obese woman with a history of bipolar disorder complains that her recent struggles with her weight and eating have caused her to feel depressed. She states that she has recently had difficulty sleeping and feels excessively anxious and agitated. She also states that she has had thoughts of suicide. She often finds herself fidgety and unable to sit still for extended periods of time. Her family tells her that she is increasingly irritable. Her current medications include lithium carbonate and zolpidem.
Query Summary: 26-year-old obese woman with bipolar disorder, on zolpidem and lithium, with recent difficulty sleeping, agitation, suicidal ideation, and irritability.
relevant documents
an example of a medical query
general-purpose vs. domain-specific search engines
note: MEDLINE is a bibliographic database focused on biomedicine; all of its records are indexed with U.S. National Library of Medicine (NLM®) Medical Subject Headings (MeSH®)
biomedical search engines
PubMed citations come from:
Medical Subject Headings (MeSH®) in MEDLINE®/PubMed®
MeSH:
MEDLINE Subject Indexing:
purpose: facilitating search retrieval by eliminating the use of variant terminology for the same concept
Article Title:
The role of coenzyme Q10 in heart failure.
Abstract:
OBJECTIVE: To review the clinical data demonstrating the safety and efficacy of coenzyme Q10 (CoQ10) in heart failure (HF).
DATA SOURCES: Pertinent literature was identified through MEDLINE (1966-January 2005) using the search terms coenzyme Q10, heart failure, antioxidants, and oxidative stress. Only articles written in the English language and evaluating human subjects were used.
DATA SYNTHESIS: HF impairs the ability of the heart to maintain its normal cardiac output. Following an initial insult, cardiac remodeling ensues, resulting in left ventricular dilation and hypertrophy. Oxidative stress is also increased, while CoQ10 levels are decreased in patients with HF. This has led to the hypothesis that CoQ10, an antioxidant, may decrease oxidative stress, impair remodeling, and improve cardiac function.
CONCLUSIONS: Large, well-designed studies on this topic are lacking. The limited data from well-designed trials indicate there may be some minor benefits with CoQ10 therapy in ejection fraction and end diastolic volume. CoQ10 therapy has been shown to be relatively safe with a low incidence of adverse effects.
Publication Types:
Review
MeSH Terms:
Antioxidants/therapeutic use*
Coenzymes
Heart Failure/drug therapy*
Heart Failure/pathology
Heart Failure/physiopathology
Humans
Oxidative Stress/drug effects
Ubiquinone/analogs & derivatives*
Ubiquinone/therapeutic use
Ventricular Remodeling/drug effects
Substances:
Antioxidants
Coenzymes
Ubiquinone
coenzyme Q10
an example of MeSH indexing
MeSH indexing
(
  "hematologic neoplasms"[MeSH Terms]
  OR ("hematologic"[All Fields] AND "neoplasms"[All Fields])
  OR "hematologic neoplasms"[All Fields]
  OR ("blood"[All Fields] AND "cancer"[All Fields])
  OR "blood cancer"[All Fields]
)
AND
(
  "leukocytes"[MeSH Terms]
  OR "leukocytes"[All Fields]
  OR ("white"[All Fields] AND "blood"[All Fields] AND "cells"[All Fields])
  OR "white blood cells"[All Fields]
  OR "leukocyte count"[MeSH Terms]
  OR ("leukocyte"[All Fields] AND "count"[All Fields])
  OR "leukocyte count"[All Fields]
  OR ("white"[All Fields] AND "blood"[All Fields] AND "cells"[All Fields])
)
Original Query: blood cancer white blood cells
Query Translation by PubMed: the Boolean expression above
Information Need: [Blood] Cancer patients who have neutropenia have a greater risk of infection. Your risk increases when your white blood cell count gets low and stays low for a long time.
ref: http://www.chemotherapy.com/side_effects/neutropenia/
an example of query translation by PubMed
Boolean Operators: AND, OR, NOT can be used to combine query terms
Parentheses: can be used for nesting individual concepts
Asterisk (wildcard): can be used for truncation
Quotation Marks: can be used for phrase searching
Square Brackets: can be used to specify the search field tags
building query blocks in PubMed
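The syntax elements listed on this slide (quotation marks, square-bracket field tags, Boolean operators, parentheses) can be combined programmatically. The following is a minimal sketch, not PubMed's own translation logic: the helper names `tagged`, `all_of`, `any_of`, and `concept_block` are hypothetical, and the pairing of the phrase "blood cancer" with the MeSH heading "hematologic neoplasms" is taken from the translated query shown earlier.

```python
def tagged(term, field="All Fields"):
    """Quote a term or phrase and attach a search field tag in square brackets."""
    return f'"{term}"[{field}]'


def all_of(*parts):
    """Nest parts in parentheses and require all of them (Boolean AND)."""
    return "(" + " AND ".join(parts) + ")"


def any_of(*parts):
    """Nest parts in parentheses and accept any of them (Boolean OR)."""
    return "(" + " OR ".join(parts) + ")"


def concept_block(mesh_heading, phrase):
    """Build one query block for a concept (hypothetical helper).

    The block combines the MeSH heading, the heading's individual words,
    the heading as a phrase, the user's individual words, and the user's phrase.
    """
    return any_of(
        tagged(mesh_heading, "MeSH Terms"),
        all_of(*(tagged(word) for word in mesh_heading.split())),
        tagged(mesh_heading),
        all_of(*(tagged(word) for word in phrase.split())),
        tagged(phrase),
    )


block = concept_block("hematologic neoplasms", "blood cancer")
print(block)
```

Under these assumptions, the printed block has the same structure as the first parenthesized group of the translated query; a second block for "white blood cells" could be joined to it with `all_of`.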
Most Recent
Relevance
Publication Date
First Author
Last Author
Journal
Title
Sort by Relevance: ranks the retrieved documents by the frequency of the query terms and MeSH terms that occur in them
sorting results in PubMed
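As a rough illustration of sorting by term frequency, the sketch below counts query-term occurrences in each document's text and adds matches between query MeSH headings and document MeSH headings. This is a toy scoring rule written for this slide, not PubMed's actual relevance algorithm; the function name `tf_score`, the document records, and the equal weighting of text and MeSH matches are all assumptions.

```python
from collections import Counter


def tf_score(query_terms, query_mesh, doc):
    """Toy relevance score: text term frequency plus MeSH heading matches."""
    counts = Counter(token.lower() for token in doc["tokens"])
    doc_mesh = {heading.lower() for heading in doc["mesh"]}
    text_hits = sum(counts[term.lower()] for term in query_terms)
    mesh_hits = sum(1 for heading in query_mesh if heading.lower() in doc_mesh)
    return text_hits + mesh_hits


# Hypothetical documents, invented for illustration only.
docs = {
    "A": {
        "tokens": "leukocyte count in blood cancer patients with blood infection".split(),
        "mesh": ["Leukocyte Count", "Hematologic Neoplasms"],
    },
    "B": {
        "tokens": "cancer screening guidelines".split(),
        "mesh": [],
    },
}

query_terms = ["blood", "cancer"]
query_mesh = ["hematologic neoplasms"]

ranked = sorted(
    docs,
    key=lambda doc_id: tf_score(query_terms, query_mesh, docs[doc_id]),
    reverse=True,
)
print(ranked)
```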
1- Does not consider the importance of concepts
In the query "blood cancer white blood cells", the concept "blood cancer" is more important than the concept "white blood cells", but no weighting is applied because PubMed performs a Boolean search.
drawbacks of PubMed
2- Cannot handle verbose free-text queries
For a verbose free-text query such as:
A 26-year-old obese woman with a history of bipolar disorder complains that her recent struggles with her weight and eating have caused her to feel depressed. She states that she has recently had difficulty sleeping and feels excessively anxious and agitated. She also states that she has had thoughts of suicide. She often finds herself fidgety and unable to sit still for extended periods of time. Her family tells her that she is increasingly irritable. Her current medications include lithium carbonate and zolpidem.
PubMed returns no results, even after stop-words are removed.
drawbacks of PubMed
3- Does not efficiently consider dependencies between terms
PubMed matches the exact phrase and the individual terms of each concept in the query, but it does not take their proximity into account. Exact-phrase matches and individual-term matches are also treated alike, with no weighting.
4- Does not consider relevance feedback documents
Concepts are identified only from the original query, using the MeSH knowledge base.
drawbacks of PubMed
5- Does not consider concept semantics
For example, a concept with semantic meaning "Sign or Symptom" is considered the same way as a concept with semantic meaning "Social Behavior".
6- Does not consider concept relationships
In PubMed, MeSH concepts are identified in the query, but the relationships between these concepts and other concepts are not considered at any level.
Unified Medical Language System (UMLS):
semantic types "appropriate" for different medical tasks
ref: Limsopatham, Nut, Craig Macdonald, and Iadh Ounis. "Inferring conceptual relationships to improve medical records search." In Proceedings of the 10th Conference on Open Research Areas in Information Retrieval, pp. 1-8, 2013.
Mapping biomedical text to the UMLS Metathesaurus
Example
query: 26-year-old obese woman with bipolar disorder, on zolpidem and lithium, with recent difficulty sleeping, agitation, suicidal ideation, and irritability.
MetaMap Results:
CUI | Concept Name | Concept Primary Name | Semantic Type |
---|---|---|---|
C0439508 | /year | per year | Temporal Concept |
C0580836 | Old | Old | Temporal Concept |
C0028754 | OBESE | Obesity | Disease or Syndrome |
C0043210 | WOMAN | Woman | Population Group |
C0005586 | Bipolar Disorder | Bipolar Disorder | Mental or Behavioral Dysfunction |
C0078839 | ZOLPIDEM | zolpidem | Organic Chemical, Pharmacologic Substance |
C0023870 | LITHIUM | Lithium | Element, Ion, or Isotope, Pharmacologic Substance |
C0332185 | Recent | Recent | Temporal Concept |
C0235162 | Difficulty sleeping | Difficulty sleeping | Sign or Symptom |
C0085631 | AGITATION | Agitation | Sign or Symptom |
C0424000 | Suicidal Ideation | Feeling suicidal (finding) | Finding |
C0022107 | IRRITABILITY | Irritable Mood | Finding |
Concept Selection by using Semantic Types
Example
query: 26-year-old obese woman with bipolar disorder, on zolpidem and lithium, with recent difficulty sleeping, agitation, suicidal ideation, and irritability.
Assume the following list of selected semantic types for this specific task:
Disease or Syndrome
Mental or Behavioral Dysfunction
Sign or Symptom
List of selected concepts:
CUI | Concept Name | Concept Primary Name | Semantic Type |
---|---|---|---|
C0028754 | OBESE | Obesity | Disease or Syndrome |
C0005586 | Bipolar Disorder | Bipolar Disorder | Mental or Behavioral Dysfunction |
C0235162 | Difficulty sleeping | Difficulty sleeping | Sign or Symptom |
C0085631 | AGITATION | Agitation | Sign or Symptom |
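The selection step on this slide amounts to a filter over the MetaMap output. The sketch below hard-codes the MetaMap results from the earlier slide rather than calling MetaMap itself; the `Concept` record and the `select_concepts` function are hypothetical names introduced only for illustration.

```python
from typing import NamedTuple, Tuple


class Concept(NamedTuple):
    cui: str
    name: str
    primary_name: str
    semantic_types: Tuple[str, ...]


# MetaMap results for the example query, copied from the slide.
METAMAP_RESULTS = [
    Concept("C0439508", "/year", "per year", ("Temporal Concept",)),
    Concept("C0580836", "Old", "Old", ("Temporal Concept",)),
    Concept("C0028754", "OBESE", "Obesity", ("Disease or Syndrome",)),
    Concept("C0043210", "WOMAN", "Woman", ("Population Group",)),
    Concept("C0005586", "Bipolar Disorder", "Bipolar Disorder",
            ("Mental or Behavioral Dysfunction",)),
    Concept("C0078839", "ZOLPIDEM", "zolpidem",
            ("Organic Chemical", "Pharmacologic Substance")),
    Concept("C0023870", "LITHIUM", "Lithium",
            ("Element, Ion, or Isotope", "Pharmacologic Substance")),
    Concept("C0332185", "Recent", "Recent", ("Temporal Concept",)),
    Concept("C0235162", "Difficulty sleeping", "Difficulty sleeping",
            ("Sign or Symptom",)),
    Concept("C0085631", "AGITATION", "Agitation", ("Sign or Symptom",)),
    Concept("C0424000", "Suicidal Ideation", "Feeling suicidal (finding)",
            ("Finding",)),
    Concept("C0022107", "IRRITABILITY", "Irritable Mood", ("Finding",)),
]

# Semantic types assumed appropriate for this task (from the slide).
SELECTED_TYPES = {
    "Disease or Syndrome",
    "Mental or Behavioral Dysfunction",
    "Sign or Symptom",
}


def select_concepts(concepts, selected_types):
    """Keep concepts that have at least one of the selected semantic types."""
    return [c for c in concepts if selected_types.intersection(c.semantic_types)]


selected = select_concepts(METAMAP_RESULTS, SELECTED_TYPES)
for concept in selected:
    print(concept.cui, concept.primary_name)
```

Note that "Suicidal Ideation" and "Irritable Mood" are dropped because MetaMap assigns them the semantic type "Finding", which is not in the selected list.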
Concept-based Medical Retrieval Approach
Approach 1
Identify concepts in all the documents in the collection
Identify concepts in the queries
Use the statistics of the concepts in the collection documents and in the query to compute the similarity of each document to the query
Rank the documents in the collection by this similarity
bag of words
bag of concepts
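The steps of Approach 1 can be sketched by treating each document as a bag of concept identifiers instead of a bag of words. The slide does not specify the similarity function, so the sketch below assumes a query-likelihood score with Dirichlet smoothing; the function name `rank_by_concepts`, the smoothing parameter `mu`, and the three toy documents are all illustrative assumptions, not the method of the cited work.

```python
import math
from collections import Counter


def rank_by_concepts(query_concepts, docs, mu=10.0):
    """Rank documents (bags of concept IDs) by smoothed query likelihood."""
    # Collection-level concept statistics.
    collection = Counter()
    for concepts in docs.values():
        collection.update(concepts)
    total = sum(collection.values())

    scores = {}
    for doc_id, concepts in docs.items():
        bag = Counter(concepts)
        score = 0.0
        for cui in query_concepts:
            p_collection = collection[cui] / total
            if p_collection == 0.0:
                continue  # concept never occurs in the collection; skip it
            score += math.log((bag[cui] + mu * p_collection) / (len(concepts) + mu))
        scores[doc_id] = score
    return sorted(scores, key=scores.get, reverse=True)


# Toy collection: each document is already mapped to concept identifiers (CUIs).
docs = {
    "doc1": ["C0005586", "C0028754", "C0235162"],  # bipolar disorder, obesity, difficulty sleeping
    "doc2": ["C0028754", "C0028754"],              # obesity only
    "doc3": ["C0085631"],                          # agitation only
}
query = ["C0005586", "C0235162"]  # bipolar disorder, difficulty sleeping

ranking = rank_by_concepts(query, docs)
print(ranking)
```

In this toy collection only `doc1` contains the query concepts, so it is ranked first; the order of the remaining documents depends only on the smoothing.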
Concept-based Medical Retrieval Approach
Approach 1 - Example
ref: Wang, Chunye, and Ramakrishna Akella. "Concept-based relevance models for medical and semantic information retrieval." In Proceedings of the 24th ACM International on Conference on Information and Knowledge Management, pp. 173-182. ACM, 2015.
Concept-based Medical Retrieval Approach
Approach 2
Identify concepts from the queries
Use the Sequential Dependence Model (SDM) to get statistics of the query concepts in the collection
Use the statistics of the concepts in the collection documents and in the query to compute the similarity of each document to the query
Rank the documents in the collection by this similarity
ref: Choi, Sungbin, Jinwook Choi, Sooyoung Yoo, Heechun Kim, and Youngho Lee. "Semantic concept-enriched dependence model for medical information retrieval." Journal of biomedical informatics 47 (2014): 18-27.
Indri Query Language:
Ranking function:
SDM assumption: dependency between adjacent query terms.
concept example: elderly patients ventilator associated pneumonia
weight parameters for single terms, ordered phrases and unordered phrases
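To make the SDM formulation concrete, the sketch below generates an Indri query for the example concept on this slide. It uses the Indri operators `#weight`, `#combine`, `#1` (exact ordered phrase) and `#uwN` (unordered window of N terms). The function name `sdm_indri_query`, the weights 0.85 / 0.10 / 0.05 and the window size 8 are assumptions (commonly used SDM defaults), not values taken from the slide or from the cited paper.

```python
def sdm_indri_query(terms, w_term=0.85, w_ordered=0.10, w_unordered=0.05, window=8):
    """Build an Indri query string following the Sequential Dependence Model.

    Only adjacent query terms are assumed to depend on each other, so ordered
    and unordered phrase features are built from adjacent term pairs.
    """
    if not terms:
        raise ValueError("at least one query term is required")
    unigrams = " ".join(terms)
    if len(terms) == 1:
        return f"#combine({unigrams})"

    pairs = list(zip(terms, terms[1:]))  # adjacent term pairs
    ordered = " ".join(f"#1({a} {b})" for a, b in pairs)
    unordered = " ".join(f"#uw{window}({a} {b})" for a, b in pairs)
    return (
        f"#weight( {w_term} #combine({unigrams}) "
        f"{w_ordered} #combine({ordered}) "
        f"{w_unordered} #combine({unordered}) )"
    )


concept = "elderly patients ventilator associated pneumonia".split()
indri_query = sdm_indri_query(concept)
print(indri_query)
```

The three weights correspond to the weight parameters for single terms, ordered phrases and unordered phrases; in practice they would be tuned on training queries.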
Collection
External Sources
Top-ranked Documents
Knowledge-base
Relationship Tables
MetaMap
Direct Identification
Query
Concept Sources:
Identification Methods:
Concept Sources and Concept Identification for Medical Query Expansion
Ranking function: linear weighted combination of matches in document D of all concept types in T
concept weight: linear weighted combination of importance features
Matching function: log of the language modeling estimate for concept κ with Dirichlet smoothing
Parameterized Concept Weighting
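The formulas on this slide did not survive extraction. The block below is a reconstruction from the verbal descriptions only, so every symbol other than D, T and κ is an assumption: Q_t denotes the query concepts of type t, λ_κ the concept weight, φ the importance features with weights w_φ, tf and cf the document and collection frequencies of κ, |C| the collection length, and μ the Dirichlet smoothing parameter.

```latex
% Ranking function: weighted matches in D over all concept types in T
\mathrm{score}(Q, D) = \sum_{t \in T} \sum_{\kappa \in Q_t} \lambda_{\kappa} \, f(\kappa, D)

% Concept weight: linear weighted combination of importance features
\lambda_{\kappa} = \sum_{\varphi \in \Phi} w_{\varphi} \, \varphi(\kappa)

% Matching function: log of the Dirichlet-smoothed language model estimate
f(\kappa, D) = \log \frac{tf_{\kappa, D} + \mu \, \dfrac{cf_{\kappa}}{|C|}}{|D| + \mu}
```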
Concept importance features
Parameterized Concept Weighting
the probability of P being health-related over all the Wikipedia pages:
P: Wikipedia page corresponding to the concept
ref: Soldaini, Luca, Arman Cohan, Andrew Yates, Nazli Goharian, and Ophir Frieder. "Retrieving medical literature for clinical decision support." In Advances in Information Retrieval, pp. 538-549. Springer International Publishing, 2015.
General-purpose knowledge-bases for query expansion
Coordinate Ascent
optimizes along one direction (one parameter) at a time
turns a multivariable optimization problem into a sequence of univariate optimization problems
Optimization Techniques for Medical query expansion
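The idea of coordinate ascent can be sketched in a few lines: hold all parameters fixed except one, improve the objective along that single coordinate, then move to the next coordinate. In query expansion the objective would typically be a retrieval metric measured on training queries as a function of the weight parameters; here a simple concave toy function stands in for it. The function name `coordinate_ascent`, the step-halving schedule and all default values are assumptions made for illustration.

```python
def coordinate_ascent(objective, start, step=0.5, shrink=0.5, tol=1e-6,
                      max_sweeps=100, max_moves=1000):
    """Maximize `objective` by optimizing one coordinate at a time."""
    point = list(start)
    best = objective(point)
    for _ in range(max_sweeps):
        if step <= tol:
            break
        improved = False
        for i in range(len(point)):
            # Univariate search along coordinate i, in both directions.
            for direction in (1.0, -1.0):
                for _ in range(max_moves):
                    candidate = point.copy()
                    candidate[i] += direction * step
                    value = objective(candidate)
                    if value <= best:
                        break  # no further gain in this direction
                    point, best, improved = candidate, value, True
        if not improved:
            step *= shrink  # refine the search once no coordinate helps
    return point, best


def toy_objective(params):
    """Concave toy function with its maximum at (1, -2)."""
    x, y = params
    return -(x - 1.0) ** 2 - (y + 2.0) ** 2


solution, value = coordinate_ascent(toy_objective, [0.0, 0.0])
print(solution, value)
```

Each inner loop is a univariate optimization problem; the outer loop repeats the sweep over coordinates until no single-coordinate move improves the objective at the finest step size.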
Why a domain-specific IR system is required for medical applications is discussed.
Traditional medical IR systems are introduced.
PubMed, the most popular medical IR system, is discussed in detail.
The drawbacks of PubMed are presented.
Medical knowledge bases are introduced.
UMLS, the most comprehensive medical knowledge base, is discussed.
Finally, two approaches to using UMLS in medical IR systems are discussed.
Conclusions
Thank You!