KnowHalu: Hallucination Detection via Multi-Form Knowledge Based Factual Checking

The Task

Who composed the famous musical score for the 1972 space-themed movie in which the character Luke Skywalker first appeared?

Joy Williams composed the score for Star Wars

Q
A

Hallucinated

Not Hallucinated

The Task: Existing Ways

Q
A
LLM
A

Probe Internal States

The Task: Existing Ways

Q
A
LLM
A

Probe Logits

The Task: Existing Ways

Q
A
LLM
A
A
A
A
A
A
A

Look for inconsistency across the sampled answers

Self-Consistency Based Detection
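
A minimal sketch of the self-consistency idea, assuming the LLM has already been sampled several times for the same question. The function names and the 0.5 threshold are hypothetical, and exact string matching stands in for whatever comparison a real detector would use (semantic similarity, for example).

```python
from collections import Counter


def consistency_score(samples: list[str]) -> float:
    """Fraction of sampled answers that agree with the most common answer."""
    if not samples:
        raise ValueError("need at least one sampled answer")
    # Crude normalization; a real system would compare meaning, not strings.
    normalized = [s.strip().lower() for s in samples]
    _, top_count = Counter(normalized).most_common(1)[0]
    return top_count / len(normalized)


def looks_hallucinated(samples: list[str], threshold: float = 0.5) -> bool:
    """Flag the answer when the samples disagree too much (threshold is illustrative)."""
    return consistency_score(samples) < threshold


samples = ["John Williams", "john williams", "Joy Williams", "John Williams "]
print(consistency_score(samples), looks_hallucinated(samples))
```

Note that this only measures agreement between samples; it cannot catch an answer the model gives consistently but wrongly.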

KnowHalu: Motivation

Q
A
LLM
A

Discriminator

Who composed the famous musical score for the 1972 space-themed movie in which the character Luke Skywalker first appeared?

Q

The discriminator has no information about the world

KnowHalu: Motivation

Q
A
LLM
A

Discriminator

Who composed the famous musical score for the 1972 space-themed movie in which the character Luke Skywalker first appeared?

Q

"World Facts"

KnowHalu: Motivation

Q
A
LLM
A

Discriminator

"World Facts"

What if the answer is factually correct but still hallucinated?

KnowHalu: Motivation

Q
A
LLM
A

Discriminator

"World Facts"

What if the answer is factually correct but still hallucinated?

What is the capital of France?

France is in Europe

Q
A

KnowHalu: Motivation

Q
A
LLM
A

Discriminator

"World Facts"

What if the answer is factually correct but still hallucinated?

Non-Fabrication Detection

If not fabricated

If fabricated

KnowHalu: Motivation

Q
A
LLM
A

Discriminator

"World Facts"

What if the answer is factually correct but still hallucinated?

Non-Fabrication Detection

If not fabricated

If fabricated

Q?

KnowHalu: Motivation

Q
A
LLM
A

Discriminator

"World Facts"

What if the answer is factually correct but still hallucinated?

What is the capital of France?

France is in Europe

Q
A

Non-Fabrication Hallucination

Fact Checking

The Task: Existing Ways

Q
A
LLM
A
A
A
A
A
A
A

Look for inconsistency across the sampled answers

Self-Consistency Based Detection

Non-Fabrication Hallucination Checking

- Prompt the model to extract the entity in the answer that corresponds to what the question asks for. If there is no such entity, return NONE.

- Yields better results than vanilla prompting

Non-Fabrication Hallucination Checking



You are now a very truthful and objective extractor in this task. Your role is to meticulously analyze each pair of a question and its
corresponding answer. Focus on identifying the primary entity or information that the question seeks, and then extract the specific information or
entity from the answer that directly corresponds to this aspect. Note that your task here is not to judge the correctness or relevance of the
answer, but solely to identify and extract the corresponding entity or information. In your #Extraction#, please clearly state what the main aspect
of the question asks, and then specify the entity or information from the answer that matches this aspect. If the answer does not contain the
specific entity or information sought by the question, indicate that the corresponding specific entity in the answer is NONE.

<FEW SHOT EXAMPLES>

In your #Extraction#, clearly identify the main focus of the question, and then pinpoint the specific entity or information in the answer that
corresponds to this focus. If the answer lacks the particular entity or information requested by the question, state that the relevant specific
entity in the answer is NONE. Notice, you DO NOT need to judge the correctness of the answer.
#Question#: {question}
#Answer#: {answer}
#Extraction#:
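
A minimal sketch of how the output of this prompt might be used, assuming the model follows the instruction and writes the literal token NONE when the answer lacks the requested entity. The helper names are hypothetical, and the check is a simple keyword match, not the authors' implementation.

```python
import re

# Tail of the prompt shown above; the instruction text and few-shot examples are omitted here.
PROMPT_TAIL = "#Question#: {question}\n#Answer#: {answer}\n#Extraction#:"


def build_prompt_tail(question: str, answer: str) -> str:
    """Fill the question/answer slots of the extraction prompt."""
    return PROMPT_TAIL.format(question=question, answer=answer)


def is_non_fabrication_hallucination(extraction: str) -> bool:
    """True when the extraction reports NONE, i.e. the answer dodges the question."""
    # Case-sensitive on purpose: the prompt asks for the upper-case token NONE.
    return re.search(r"\bNONE\b", extraction) is not None


dodging = "The question asks for the capital of France. The corresponding entity in the answer is NONE."
on_topic = "The question asks for the composer. The corresponding entity in the answer is Joy Williams."
print(is_non_fabrication_hallucination(dodging), is_non_fabrication_hallucination(on_topic))
```

An answer flagged here can be labeled hallucinated without retrieval; the others go on to fact checking.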

Fact Checking

Who composed the famous musical score for the 1972 space-themed movie in which the character Luke Skywalker first appeared?

Joy Williams composed the score for Star Wars

Q
A

There are multiple facts we might want to check:

1. Was Star Wars released in 1972?
2. Is Star Wars a space-themed movie?
3. Did Joy Williams compose the score for Star Wars?
4. Did Luke Skywalker first appear in Star Wars?

Fact Checking

Who composed the famous musical score for the 1972 space-themed movie in which the character Luke Skywalker first appeared?

Joy Williams composed the score for Star Wars

Q
A

There are multiple facts we might want to check:

1. Was Star Wars released in 1972?
2. Is Star Wars a space-themed movie?
3. Did Joy Williams compose the score for Star Wars?
4. Did Luke Skywalker first appear in Star Wars?

Retrieving with short "one-hop" queries works better than retrieving with broad "multi-hop" queries

Fact Checking

Q
A

subqueries 

Fact Checking

Q
A

How are these queries formulated?

subqueries 

Who composed the score for Star Wars?

Did Joy Williams compose the score for Star Wars?

Fact Checking

Q
A

How are these queries formulated?

subqueries 

Who composed the score for Star Wars?

Did Joy Williams compose the score for Star Wars?

General Queries

Specific Queries

General queries perform well when the answer is hallucinated
Specific queries perform well when the answer is not hallucinated

Fact Checking

Q
A

subqueries 

Step-wise reasoning and query generation

Because the process is iterative (step i+1 depends on the knowledge gained at step i), this module interacts with the next module, knowledge retrieval, at every step
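
The interleaving of query generation and retrieval can be sketched as a simple loop. This is an illustration under assumptions: `next_query` and `retrieve` are hypothetical callables standing in for the LLM query generator and the retriever, and the stubs below only replay fixed strings.

```python
from typing import Callable, Optional

History = list[tuple[str, str]]  # (query, knowledge) pairs gathered so far


def run_verification(
    next_query: Callable[[History], Optional[str]],
    retrieve: Callable[[str], str],
    max_steps: int = 5,
) -> History:
    """Alternate query generation and retrieval until the generator stops."""
    history: History = []
    for _ in range(max_steps):
        query = next_query(history)  # step i+1 sees the knowledge from steps 1..i
        if query is None:            # generator decides enough has been gathered
            break
        history.append((query, retrieve(query)))
    return history


# Stubs for illustration only; a real system would call an LLM and a retriever.
SCRIPTED = [
    "Was Star Wars released in 1972?",
    "Did Joy Williams compose the score for Star Wars?",
]


def stub_next_query(history: History) -> Optional[str]:
    return SCRIPTED[len(history)] if len(history) < len(SCRIPTED) else None


def stub_retrieve(query: str) -> str:
    return f"knowledge for: {query}"


for query, knowledge in run_verification(stub_next_query, stub_retrieve):
    print(query, "->", knowledge)
```

The `max_steps` cap is a safety choice of this sketch so that a generator that never stops cannot loop forever.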

Step-Wise Reasoning and Query Prompt

As a truthful and objective query specialist, your role is to craft precise queries for verifying the accuracy of provided answers. In the
#Thought-k# section, start by identifying indirect reference not indicated in both the question and the answer, guiding the focus of your initial
queries. Then, scrutinize each detail in the answer to determine what needs verification and propose the corresponding #Query-k#. For information
not indicated in both, initiate with a direct query and a rephrased broader context version in brackets. For details given in the answer, include
the claim in your query, such as "Did (entity from the answer) do (action/question's focus)?" and append a more general query without specifying
the key entity for a wider context in brackets. Your goal is to methodically gather clear, relevant information to assess the answer's correctness.
#Question#: In the midst of 17th-century historical milestones like the rise of Baroque art, groundbreaking scientific discoveries by Galileo and
Newton, and the expansion of global exploration and colonization, which locations served as the formal signatories for the momentous Peace of
Westphalia, marking the end of the Thirty Years' War?
#Answer#: Munster and Osnabruck, Germany, and it was signed in 1648.
#Thought-1#: The first query should confirm whether the Peace of Westphalia was indeed signed in Munster and Osnabruck, Germany, as provided by the
answer.
#Query-1#: Was the Peace of Westphalia signed in Munster and Osnabruck, Germany? [Where was the Peace of Westphalia signed?]
#Knowledge-1#: (Peace of Westphalia, signed in, Munster and Osnabruck, Germany)
#Thought-2#: Having confirmed the locations, the next step is to validate the year '1648' of the signing, as mentioned in the answer.
#Query-2#: Was the Peace of Westphalia signed in the year 1648? [When was the Peace of Westphalia signed?]
#Knowledge-2#: (Peace of Westphalia, signed in, October 1648)
#Thought-3#: All the necessary information to judge the correctness of the answer has been obtained, so the query process can now be concluded.

<FEW SHOT EXAMPLES>

Please ensure that all queries are direct, clear, and explicitly relate to the specific context provided in the question and answer. Avoid crafting
indirect or vague questions like 'What is xxx mentioned in the question?' Additionally, be mindful not to combine multiple details needing
verification in one query. Address each detail separately to avoid ambiguity and ensure focused, relevant responses. Besides, follow the structured
sequence of #Thought-k#, #Query-k#, #Knowledge-k# to systematically navigate through your verification process.
#Question#: {question}
#Answer#: {answer}
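
The prompt above asks for each query as a specific question followed by a more general version in brackets. A minimal sketch of splitting a generated trace into those two forms, assuming the model reproduces the `#Query-k#:` line format exactly; the `QueryPair` type and `parse_queries` helper are hypothetical names.

```python
import re
from typing import NamedTuple, Optional


class QueryPair(NamedTuple):
    step: int
    specific: str            # query that includes the claim from the answer
    general: Optional[str]   # bracketed broader query, if one was given


# "#Query-k#: <specific query> [<general query>]" with the bracket part optional.
_QUERY_LINE = re.compile(r"^#Query-(\d+)#:\s*(.+?)\s*(?:\[(.+)\]\s*)?$")


def parse_queries(trace: str) -> list[QueryPair]:
    """Collect (step, specific, general) from every #Query-k# line in a trace."""
    pairs = []
    for line in trace.splitlines():
        match = _QUERY_LINE.match(line.strip())
        if match:
            pairs.append(QueryPair(int(match.group(1)), match.group(2), match.group(3)))
    return pairs


trace = """#Thought-1#: Confirm the locations given in the answer.
#Query-1#: Was the Peace of Westphalia signed in Munster and Osnabruck, Germany? [Where was the Peace of Westphalia signed?]
#Query-2#: Was the Peace of Westphalia signed in the year 1648? [When was the Peace of Westphalia signed?]"""

for pair in parse_queries(trace):
    print(pair)
```

Keeping both forms lets the retriever be run with either one, which matters because the two forms behave differently on hallucinated and non-hallucinated answers.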

Knowledge Retrieval

Q
A

Retriever

Knowledge Retrieval

Q
A

Retriever

⋮

ReAct-style prompting

Knowledge Retrieval

Q
A

Retriever

⋮

ReAct-style prompting

Knowledge

Structured

Unstructured

Knowledge Optimization

Knowledge

Structured 

Unstructured

Structured

(Star Wars, was, 1977 space-themed movie)

Unstructured

Star Wars is a space-themed movie that was released in 1977.

(object-predicate-object) triplets
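
A minimal sketch of the structured form as a small data type, assuming triplets are written as three comma-separated fields in parentheses as in the example above. The field names `subject`, `predicate`, and `obj` are labels chosen for this sketch, not terminology from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triplet:
    subject: str
    predicate: str
    obj: str

    def render(self) -> str:
        """Write the triplet back in the parenthesized form used on the slides."""
        return f"({self.subject}, {self.predicate}, {self.obj})"


def parse_triplet(text: str) -> Triplet:
    """Parse '(a, b, c)'; commas after the second one stay in the last field."""
    inner = text.strip()
    if not (inner.startswith("(") and inner.endswith(")")):
        raise ValueError(f"not a parenthesized triplet: {text!r}")
    parts = [part.strip() for part in inner[1:-1].split(",", 2)]
    if len(parts) != 3:
        raise ValueError(f"expected three fields: {text!r}")
    return Triplet(*parts)


triplet = parse_triplet("(Star Wars, was, 1977 space-themed movie)")
print(triplet.render())
```

Splitting on only the first two commas keeps a value such as "Munster and Osnabruck, Germany" intact in the last field.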

"Multi-Form Knowledge"

Knowledge Optimization

As an objective responder, your primary role is to provide accurate answers in triplets form by extracting relevant information from available
knowledge sources, which are presented as article titles and summaries. Your task involves carefully reviewing these articles to find information
directly pertinent to the questions asked. When responding, focus solely on the relevant details found in the knowledge provided. If the provided
knowledge does not contain the necessary details to answer a question, respond with "No specific information is available."
#Query#: Was the Peace of Westphalia signed in Munster and Osnabruck, Germany? [Where was the Peace of Westphalia signed?]
#Knowledge#: Title: Peace of Westphalia. Article: The Peace of Westphalia is the collective name for two peace treaties signed in October 1648
in the Westphalian cities of Osnabruck and Munster. They ended the Thirty Years' War (1618–1648) and brought peace to the Holy Roman Empire,
closing a calamitous period of European history that killed approximately eight million people. Holy Roman Emperor Ferdinand III, the kingdoms of
France and Sweden, and their respective allies among the princes of the Holy Roman Empire, participated in the treaties. The negotiation process
was lengthy and complex.
Title: Peace of Westphalia. Article: Talks took place in two cities, because each side wanted to meet on territory under its own control. A total
of 109 delegations arrived to represent the belligerent states, but not all delegations were present at the same time. Two treaties were signed to
end the war in the Empire: the Treaty of Munster and the Treaty of Osnabruck.
#Answer#: (Peace of Westphalia, signed in, Munster and Osnabruck, Germany)

<FEW SHOT EXAMPLES>


#Query#: {question}
#Knowledge#: {knowledge}
#Answer#:

Knowledge Optimization

As an objective responder, your primary role is to provide accurate answers by extracting relevant information from available knowledge sources,
which are presented as article titles and summaries. Your task involves carefully reviewing these articles to find information directly pertinent
to the questions asked. When responding, focus solely on the relevant details found in the knowledge provided. If the provided knowledge does not
contain the necessary details to answer a question, respond with "No specific information is available."
#Query#: Was the Peace of Westphalia signed in Munster and Osnabruck, Germany? [Where was the Peace of Westphalia signed?]
#Knowledge#: Title: Peace of Westphalia. Article: The Peace of Westphalia is the collective name for two peace treaties signed in October 1648
in the Westphalian cities of Osnabruck and Munster. They ended the Thirty Years' War (1618–1648) and brought peace to the Holy Roman Empire,
closing a calamitous period of European history that killed approximately eight million people. Holy Roman Emperor Ferdinand III, the kingdoms of
France and Sweden, and their respective allies among the princes of the Holy Roman Empire, participated in the treaties. The negotiation process
was lengthy and complex.
Title: Peace of Westphalia. Article: Talks took place in two cities, because each side wanted to meet on territory under its own control. A total
of 109 delegations arrived to represent the belligerent states, but not all delegations were present at the same time. Two treaties were signed to
end the war in the Empire: the Treaty of Munster and the Treaty of Osnabruck.
#Answer#: Yes, the Peace of Westphalia was signed in Munster and Osnabruck, Germany.

<FEW SHOT EXAMPLES>

#Query#: {question}
#Knowledge#: {knowledge}
#Answer#:

Knowledge Retrieval

Retriever

ColBERT

PLAID

Judgement

Q
A

Retriever

⋮

ReAct-style prompting

Judgement

[Diagram: the (Query, Knowledge) pairs are passed to LLMs, which judge the answer as CORRECT, INCORRECT, or INCONCLUSIVE. Branch labels: IF CONSISTENT, NOT CONSISTENT.]

Judgement

[Diagram: the (Query, Knowledge) pairs are judged by LLMs in two pipelines, one labeled Structured and one labeled Unstructured. Each pipeline outputs CORRECT, INCORRECT, or INCONCLUSIVE. Branch labels: IF CONSISTENT, NOT CONSISTENT.]

Judgement: Aggregation

[Diagram: the (Query, Knowledge) pairs are judged by LLMs in two pipelines, one labeled Structured and one labeled Unstructured. Each pipeline outputs CORRECT, INCORRECT, or INCONCLUSIVE. Branch labels: IF CONSISTENT, NOT CONSISTENT.]

Judgement: Aggregation

[Diagram: the (Query, Knowledge) pairs are judged by LLMs in two pipelines, one labeled Structured and one labeled Unstructured. Each pipeline outputs CORRECT, INCORRECT, or INCONCLUSIVE. Branch labels: IF CONSISTENT, NOT CONSISTENT. The two judgments are labeled Base and Supplement.]

If $c_1 < \delta_1$ and $c_2 > \delta_2$: use Supplement
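
The aggregation rule on this slide (fall back to the Supplement judgment when c_1 < δ_1 and c_2 > δ_2) can be sketched as below. This assumes c_1 and c_2 are confidence scores attached to the Base and Supplement judgments; the `Judgment` type and the default threshold values are placeholders for illustration, not values from the slides.

```python
from typing import NamedTuple


class Judgment(NamedTuple):
    label: str         # "CORRECT", "INCORRECT", or "INCONCLUSIVE"
    confidence: float  # assumed to be a score in [0, 1]


def aggregate(
    base: Judgment,
    supplement: Judgment,
    delta_1: float = 0.6,  # placeholder threshold on the Base confidence (c_1)
    delta_2: float = 0.8,  # placeholder threshold on the Supplement confidence (c_2)
) -> Judgment:
    """Keep the Base judgment unless it is weak and the Supplement is strong."""
    if base.confidence < delta_1 and supplement.confidence > delta_2:
        return supplement
    return base


weak_base = Judgment("INCONCLUSIVE", 0.4)
strong_supplement = Judgment("INCORRECT", 0.9)
print(aggregate(weak_base, strong_supplement))
```

Both conditions must hold, so a confident Base judgment is never overridden, and a weak Supplement never replaces a weak Base.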

Judgement: Aggregation

[Diagram: the (Query, Knowledge) pairs are judged by LLMs in two pipelines, one labeled Structured and one labeled Unstructured. Each pipeline outputs CORRECT, INCORRECT, or INCONCLUSIVE. Branch labels: IF CONSISTENT, NOT CONSISTENT. The two judgments are labeled Base and Supplement.]

Overall Workflow

Experiments and Results

Experiments

  • Uses the HaluEval dataset

  • Two tasks: multi-hop QA and text summarization

  • 1,000 QA samples and 500 summarization samples

  • Metrics: TPR, TNR, average accuracy (Avg. Acc), abstain rates (positive and negative)
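
A minimal sketch of how these metrics could be computed when the detector may return an inconclusive verdict. The definitions here are assumptions for illustration: TPR and TNR are taken over all positive (hallucinated) and negative (non-hallucinated) samples, average accuracy is the mean of TPR and TNR, and the abstain rate is the share of inconclusive verdicts in each class. The label strings and function name are hypothetical.

```python
def detection_metrics(labels: list[bool], predictions: list[str]) -> dict[str, float]:
    """labels: True = hallucinated. predictions: 'hallucinated',
    'not_hallucinated', or 'inconclusive' (an abstention)."""
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have the same length")
    positives = [p for is_pos, p in zip(labels, predictions) if is_pos]
    negatives = [p for is_pos, p in zip(labels, predictions) if not is_pos]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")

    tpr = positives.count("hallucinated") / len(positives)
    tnr = negatives.count("not_hallucinated") / len(negatives)
    return {
        "TPR": tpr,
        "TNR": tnr,
        "Avg Acc": (tpr + tnr) / 2,  # assumed definition: mean of TPR and TNR
        "Abstain (pos)": positives.count("inconclusive") / len(positives),
        "Abstain (neg)": negatives.count("inconclusive") / len(negatives),
    }


labels = [True, True, True, True, False, False, False, False]
predictions = [
    "hallucinated", "hallucinated", "inconclusive", "not_hallucinated",
    "not_hallucinated", "not_hallucinated", "not_hallucinated", "hallucinated",
]
print(detection_metrics(labels, predictions))
```

Under these assumed definitions an abstention counts against TPR or TNR, so a detector cannot raise its accuracy simply by abstaining more often.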

"Inconclusive" is possible

Results

  • Starling-7B is comparable with zero-shot GPT-4
  • The form of knowledge matters: Starling is better with unstructured knowledge, GPT-3.5 is better with structured knowledge
  • Both methods are better than the GPT-4 baseline

Results

- Performs better than the baseline

- GPT-3.5 scores lower than Starling-7B because it refuses to answer for certain summaries

- Observed "lazy" behavior

Thank you.

Let's connect?

Let's discuss!

KnowHalu

By Incredible us
