KnowHalu: Hallucination Detection via Multi-Form Knowledge Based Factual Checking
The Task
Who composed the famous musical score for the 1972 space-themed movie in which the character Luke Skywalker first appeared?
Joy Williams composed the score for Star Wars
Hallucinated
Not Hallucinated
The Task: Existing Ways
- Probe internal states
- Probe logits
- Look for "inconsistency" across answers: self-consistency-based detection
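As a rough illustration of the self-consistency idea, the sketch below scores how much a set of independently sampled answers agree with one another. The function names, the exact-match normalization, and the 0.5 threshold are assumptions made for illustration; none of them come from the slides.

```python
from collections import Counter


def agreement_score(sampled_answers: list[str]) -> float:
    """Fraction of samples that match the most common normalized answer."""
    if not sampled_answers:
        raise ValueError("need at least one sampled answer")
    normalized = [a.strip().lower() for a in sampled_answers]
    _, top_count = Counter(normalized).most_common(1)[0]
    return top_count / len(normalized)


def looks_hallucinated(sampled_answers: list[str], threshold: float = 0.5) -> bool:
    """Hypothetical rule: flag the answer when the samples disagree too much."""
    return agreement_score(sampled_answers) < threshold
```

A detector of this kind only sees the model's own answers, which is the limitation the motivation slides pick up on.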
KnowHalu: Motivation
Discriminator
Who composed the famous musical score for the 1972 space-themed movie in which the character Luke Skywalker first appeared?
The discriminator has no information about the world
"World Facts"
What if the answer is factually correct but still hallucinated?
Question: What is the capital of France?
Answer: France is in Europe
[Diagram: Non-Fabrication Detection with two branches, "If Not Fabricated" and "If Fabricated"]
Non-Fabrication Hallucination
Fact Checking
Non-Fabrication Hallucination Checking
- Prompt the model to extract, from the answer, the entity that the question asks for; if the answer contains no such entity, return NONE
- Yields better results than vanilla prompting
Non-Fabrication Hallucination Checking: Prompt
You are now a very truthful and objective extractor in this task. Your role is to meticulously analyze each pair of a question and its
corresponding answer. Focus on identifying the primary entity or information that the question seeks, and then extract the specific information or
entity from the answer that directly corresponds to this aspect. Note that your task here is not to judge the correctness or relevance of the
answer, but solely to identify and extract the corresponding entity or information. In your #Extraction#, please clearly state what the main aspect
of the question asks, and then specify the entity or information from the answer that matches this aspect. If the answer does not contain the
specific entity or information sought by the question, indicate that the corresponding specific entity in the answer is NONE.
<FEW SHOT EXAMPLES>
In your #Extraction#, clearly identify the main focus of the question, and then pinpoint the specific entity or information in the answer that
corresponds to this focus. If the answer lacks the particular entity or information requested by the question, state that the relevant specific
entity in the answer is NONE. Notice, you DO NOT need to judge the correctness of the answer.
#Question#: {question}
#Answer#: {answer}
#Extraction#:
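A minimal sketch of how the template above might be filled in and how its output could be checked. The helper names and the rule of searching for the standalone token NONE are assumptions for illustration; the slides do not say how the extraction is parsed.

```python
import re

# The three slot lines that close the prompt shown above.
EXTRACTION_SUFFIX = "#Question#: {question}\n#Answer#: {answer}\n#Extraction#:"


def build_extraction_prompt(instruction: str, question: str, answer: str) -> str:
    """Append the filled-in question/answer slots to the instruction text."""
    slots = EXTRACTION_SUFFIX.format(question=question, answer=answer)
    return instruction.rstrip() + "\n" + slots


def is_non_fabrication_hallucination(extraction: str) -> bool:
    """True when the extractor reports that the sought entity is NONE."""
    # Assumed parsing rule: look for the standalone upper-case token NONE.
    return re.search(r"\bNONE\b", extraction) is not None
```

For the "capital of France" example, an extraction ending in "the corresponding entity in the answer is NONE" would be flagged without any fact checking.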
Fact Checking
Who composed the famous musical score for the 1972 space-themed movie in which the character Luke Skywalker first appeared?
Joy Williams composed the score for Star Wars
There are multiple facts we might want to check:
1. Was Star Wars released in 1972?
2. Is Star Wars a space-themed movie?
3. Did Joy Williams compose the score for Star Wars?
4. Did Luke Skywalker first appear in Star Wars?
Retrieving with short "one-hop" queries performs better than retrieving with broad "multi-hop" queries
Fact Checking
How are these subqueries formulated?
Who composed the score for Star Wars?
Did Joy Williams compose the score for Star Wars?
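- General queries (e.g. "Who composed the score for Star Wars?") perform well when the answer is hallucinated
- Specific queries (e.g. "Did Joy Williams compose the score for Star Wars?") perform well when the answer is not factually hallucinated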
Step-wise reasoning and query generation
Since the process is iterative, i.e., step i+1 depends on the knowledge gained at step i, this module interacts with the next module (knowledge retrieval) at every step
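The interleaving of query generation and retrieval can be sketched as a simple loop. This is only an illustration under assumptions: `next_query` and `retrieve` are hypothetical callables standing in for the LLM query generator and the retriever, and the `max_steps` cap is not taken from the slides.

```python
from typing import Callable, Optional

Step = tuple[str, str]  # (query, knowledge retrieved for that query)


def run_fact_checking(
    next_query: Callable[[list[Step]], Optional[str]],
    retrieve: Callable[[str], str],
    max_steps: int = 5,
) -> list[Step]:
    """Alternate between query generation and retrieval until no query is left."""
    history: list[Step] = []
    for _ in range(max_steps):
        # Step i+1 sees every (query, knowledge) pair gathered in steps 1..i.
        query = next_query(history)
        if query is None:  # the generator decided it has enough information
            break
        history.append((query, retrieve(query)))
    return history
```

The returned list of (query, knowledge) pairs is what a later judgement step would consume.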
Step-Wise Reasoning and Query Prompt
As a truthful and objective query specialist, your role is to craft precise queries for verifying the accuracy of provided answers. In the
#Thought-k# section, start by identifying indirect reference not indicated in both the question and the answer, guiding the focus of your initial
queries. Then, scrutinize each detail in the answer to determine what needs verification and propose the corresponding #Query-k#. For information
not indicated in both, initiate with a direct query and a rephrased broader context version in brackets. For details given in the answer, include
the claim in your query, such as "Did (entity from the answer) do (action/question's focus)?" and append a more general query without specifying
the key entity for a wider context in brackets. Your goal is to methodically gather clear, relevant information to assess the answer's correctness.
#Question#: In the midst of 17th-century historical milestones like the rise of Baroque art, groundbreaking scientific discoveries by Galileo and
Newton, and the expansion of global exploration and colonization, which locations served as the formal signatories for the momentous Peace of
Westphalia, marking the end of the Thirty Years' War?
#Answer#: Munster and Osnabruck, Germany, and it was signed in 1648.
#Thought-1#: The first query should confirm whether the Peace of Westphalia was indeed signed in Munster and Osnabruck, Germany, as provided by the
answer.
#Query-1#: Was the Peace of Westphalia signed in Munster and Osnabruck, Germany? [Where was the Peace of Westphalia signed?]
#Knowledge-1#: (Peace of Westphalia, signed in, Munster and Osnabruck, Germany)
#Thought-2#: Having confirmed the locations, the next step is to validate the year '1648' of the signing, as mentioned in the answer.
#Query-2#: Was the Peace of Westphalia signed in the year 1648? [When was the Peace of Westphalia signed?]
#Knowledge-2#: (Peace of Westphalia, signed in, October 1648)
#Thought-3#: All the necessary information to judge the correctness of the answer has been obtained, so the query process can now be concluded.
<FEW SHOT EXAMPLES>
Please ensure that all queries are direct, clear, and explicitly relate to the specific context provided in the question and answer. Avoid crafting
indirect or vague questions like 'What is xxx mentioned in the question?' Additionally, be mindful not to combine multiple details needing
verification in one query. Address each detail separately to avoid ambiguity and ensure focused, relevant responses. Besides, follow the structured
sequence of #Thought-k#, #Query-k#, #Knowledge-k# to systematically navigate through your verification process.
#Question#: {question}
#Answer#: {answer}
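Each #Query-k# line in the prompt above carries a specific query followed by a more general one in brackets. A minimal parsing sketch follows; the regular expression and the function name are assumptions for illustration, not part of the original method.

```python
import re
from typing import Optional

# "#Query-k#: <specific query> [<general query>]", where the bracket part is optional.
QUERY_LINE = re.compile(r"^#Query-(\d+)#:\s*(.+?)\s*(?:\[(.+)\]\s*)?$")


def parse_query_line(line: str) -> tuple[int, str, Optional[str]]:
    """Return (step index, specific query, bracketed general query or None)."""
    match = QUERY_LINE.match(line.strip())
    if match is None:
        raise ValueError(f"not a #Query-k# line: {line!r}")
    index, specific, general = match.groups()
    return int(index), specific, general
```

Keeping the two forms separate lets the retriever be called with either one.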
Knowledge Retrieval
Retriever
ReAct-style prompting
Knowledge: structured and unstructured
Knowledge Optimization
Knowledge takes two forms: structured and unstructured
Structured: (subject, predicate, object) triplets
(Star Wars, was, 1977 space-themed movie)
Unstructured:
Star Wars is a space-themed movie that was released in 1977
"Multi-Form Knowledge"
Knowledge Optimization: Structured Prompt
As an objective responder, your primary role is to provide accurate answers in triplets form by extracting relevant information from available
knowledge sources, which are presented as article titles and summaries. Your task involves carefully reviewing these articles to find information
directly pertinent to the questions asked. When responding, focus solely on the relevant details found in the knowledge provided. If the provided
knowledge does not contain the necessary details to answer a question, respond with "No specific information is available."
#Query#: Was the Peace of Westphalia signed in Munster and Osnabruck, Germany? [Where was the Peace of Westphalia signed?]
#Knowledge#: Title: Peace of Westphalia. Article: The Peace of Westphalia is the collective name for two peace treaties signed in October 1648
in the Westphalian cities of Osnabruck and Munster. They ended the Thirty Years' War (1618–1648) and brought peace to the Holy Roman Empire,
closing a calamitous period of European history that killed approximately eight million people. Holy Roman Emperor Ferdinand III, the kingdoms of
France and Sweden, and their respective allies among the princes of the Holy Roman Empire, participated in the treaties. The negotiation process
was lengthy and complex.
Title: Peace of Westphalia. Article: Talks took place in two cities, because each side wanted to meet on territory under its own control. A total
of 109 delegations arrived to represent the belligerent states, but not all delegations were present at the same time. Two treaties were signed to
end the war in the Empire: the Treaty of Munster and the Treaty of Osnabruck.
#Answer#: (Peace of Westphalia, signed in, Munster and Osnabruck, Germany)
<FEW SHOT EXAMPLES>
#Query#: {question}
#Knowledge#: {knowledge}
#Answer#:
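The structured answers produced by this prompt are plain strings such as "(Peace of Westphalia, signed in, Munster and Osnabruck, Germany)". A minimal sketch of turning one into a tuple follows. It assumes that the subject and predicate contain no commas, so that any further commas belong to the object; the function name is hypothetical.

```python
def parse_triplet(text: str) -> tuple[str, str, str]:
    """Split "(subject, predicate, object)" into its three parts."""
    inner = text.strip()
    if not (inner.startswith("(") and inner.endswith(")")):
        raise ValueError(f"not a parenthesized triplet: {text!r}")
    # Assumption: only the first two commas separate fields; anything after
    # the second comma is kept as part of the object.
    parts = [part.strip() for part in inner[1:-1].split(",", 2)]
    if len(parts) != 3:
        raise ValueError(f"expected three fields: {text!r}")
    return parts[0], parts[1], parts[2]
```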
Knowledge Optimization: Unstructured Prompt
As an objective responder, your primary role is to provide accurate answers by extracting relevant information from available knowledge sources,
which are presented as article titles and summaries. Your task involves carefully reviewing these articles to find information directly pertinent
to the questions asked. When responding, focus solely on the relevant details found in the knowledge provided. If the provided knowledge does not
contain the necessary details to answer a question, respond with "No specific information is available."
#Query#: Was the Peace of Westphalia signed in Munster and Osnabruck, Germany? [Where was the Peace of Westphalia signed?]
#Knowledge#: Title: Peace of Westphalia. Article: The Peace of Westphalia is the collective name for two peace treaties signed in October 1648
in the Westphalian cities of Osnabruck and Munster. They ended the Thirty Years' War (1618–1648) and brought peace to the Holy Roman Empire,
closing a calamitous period of European history that killed approximately eight million people. Holy Roman Emperor Ferdinand III, the kingdoms of
France and Sweden, and their respective allies among the princes of the Holy Roman Empire, participated in the treaties. The negotiation process
was lengthy and complex.
Title: Peace of Westphalia. Article: Talks took place in two cities, because each side wanted to meet on territory under its own control. A total
of 109 delegations arrived to represent the belligerent states, but not all delegations were present at the same time. Two treaties were signed to
end the war in the Empire: the Treaty of Munster and the Treaty of Osnabruck.
#Answer#: Yes, the Peace of Westphalia was signed in Munster and Osnabruck, Germany.
<FEW SHOT EXAMPLES>
#Query#: {question}
#Knowledge#: {knowledge}
#Answer#:
Knowledge Retrieval
Retriever: ColBERT, PLAID
Judgement
[Diagram: (Query, Knowledge) pairs are passed to LLMs, which judge the answer as CORRECT, INCORRECT, or INCONCLUSIVE; the branches are labeled "if consistent" and "not consistent". The judgement is made once with structured knowledge and once with unstructured knowledge.]
Judgement: Aggregation
[Diagram: the structured and unstructured judgements (each CORRECT, INCORRECT, or INCONCLUSIVE) are aggregated; one form serves as the base and the other as the supplement, with a branch labeled "use supplement".]
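One way to picture the base/supplement aggregation is sketched below. The slides do not state when the supplement is used, so the rule here (fall back to the supplement only when the base judgement is INCONCLUSIVE) is an assumption, as are the names `Judgement` and `aggregate`.

```python
from enum import Enum


class Judgement(Enum):
    CORRECT = "CORRECT"
    INCORRECT = "INCORRECT"
    INCONCLUSIVE = "INCONCLUSIVE"


def aggregate(base: Judgement, supplement: Judgement) -> Judgement:
    """Prefer the base judgement; fall back to the supplement when the base abstains."""
    # Assumed rule, not stated on the slides: only an INCONCLUSIVE base
    # triggers the "use supplement" branch.
    if base is Judgement.INCONCLUSIVE:
        return supplement
    return base
```

Which knowledge form plays the base role (structured or unstructured) is left to the caller in this sketch.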
Overall Workflow
Experiments and Results
Experiments
- Uses the HaluEval dataset
- Two tasks: multi-hop QA and text summarization
- 1000 QA data points and 500 summarization data points
- Metrics: TPR, TNR, average accuracy, and abstain rates (positive and negative)
- "Inconclusive" is a possible judgement
Results
- Starling-7B is comparable with zero-shot GPT-4
- The form of knowledge matters: Starling is better with unstructured knowledge, GPT-3.5 with structured knowledge
- Both methods are better than the GPT-4 baseline
Results
- Performs better than the baseline
- GPT-3.5 scores below Starling-7B because it refuses to answer for certain summaries
- Observed "lazy" behavior
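The metrics listed under Experiments (TPR, TNR, average accuracy, and the two abstain rates) can be computed along the following lines. This is a sketch under assumptions: a hallucinated sample is treated as the positive class, an inconclusive prediction counts as a miss for TPR and TNR, and average accuracy is taken as the mean of TPR and TNR. The slides do not spell out these definitions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Metrics:
    tpr: float
    tnr: float
    avg_acc: float
    abstain_pos: float
    abstain_neg: float


def evaluate(labels: list[bool], predictions: list[Optional[bool]]) -> Metrics:
    """labels: True means hallucinated. predictions: None means inconclusive."""
    pos = [p for y, p in zip(labels, predictions) if y]
    neg = [p for y, p in zip(labels, predictions) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    tpr = sum(p is True for p in pos) / len(pos)   # abstentions count as misses
    tnr = sum(p is False for p in neg) / len(neg)
    return Metrics(
        tpr=tpr,
        tnr=tnr,
        avg_acc=(tpr + tnr) / 2,
        abstain_pos=sum(p is None for p in pos) / len(pos),
        abstain_neg=sum(p is None for p in neg) / len(neg),
    )
```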
Thank you.
Let's Connect?
Let's Discuss!
KnowHalu
By Incredible us