
Ask Me Anything: Question Answering Systems

Hate to sift through piles of information just to find an answer to a simple question? QA Systems are just the tech for you!
Published on 15 February 2022


Ever ploughed through tons of information trying to find a simple answer to your question?

Wouldn’t it be useful to have a search engine that can extract answers from multiple sources, without us having to read through lengthy texts?

HTX’s Data Science and AI (DSAI) Centre of Expertise, in collaboration with Q-Team, works with Question Answering (QA) Systems to address this need.

A QA System is an extractive search system that provides direct answers to questions posed in natural language about groups of information (entities) found within texts.

Current Challenges With Entity Extraction

Traditional methods of entity extraction depend on pre-defined rules, which can be restrictive, as the model cannot be generalised to other use cases beyond these rules. For example, the entity ‘phone number’ is pre-defined according to the common format of 8 numerical digits, beginning with a “6”, “8”, or “9”.
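To make the limitation concrete, here is a minimal sketch of such a pre-defined rule, written as a Python regular expression. The pattern encodes only the format described above (8 digits beginning with 6, 8, or 9); the function name and sample sentence are hypothetical and chosen purely for illustration.

```python
import re
from typing import List

# Rule from the text: 8 numerical digits, beginning with "6", "8", or "9".
# The lookarounds stop the rule from matching inside a longer run of digits.
PHONE_RULE = re.compile(r"(?<!\d)[689]\d{7}(?!\d)")


def extract_phone_numbers(text: str) -> List[str]:
    """Return every substring of `text` that fits the fixed phone-number rule."""
    return PHONE_RULE.findall(text)


# Hypothetical sample sentence, for illustration only.
sample = "Call 91234567 or 6123 4567; the old line was 71234567."
print(extract_phone_numbers(sample))
```

The rule picks up 91234567 but misses "6123 4567", because the space breaks the expected format. Each such variation needs another hand-written rule, which is the restriction described above.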

Such traditional approaches to entity extraction thus face three limitations.

Firstly, the list of entities is endless. Although well-known examples of ‘weapons’ include knives or guns, ‘weapons’ could also include water bottles or frying pans.

Secondly, entities can be ambiguous. The entity ‘location’ is usually an address, but could refer to “the grass patch beside the MRT station”.

Finally, most extraction techniques, being developed in a Western context, are not localised and cannot recognise Singaporean entities.

How do QA Systems overcome these challenges? They employ pre-trained, state-of-the-art deep learning models developed by large tech companies such as Google or Facebook.

Such models are trained to detect the context of a particular word within the text, rather than being limited by matching keywords or formats.

For better answering accuracy, the models can be finetuned with custom data. This resolves the problems of ambiguity and localisation. Think of them as achieving an outcome similar to that of a human reader scanning through the text and answering 5W1H (Who, What, When, Where, Why, How) questions on it!

How It Works

QA systems are used in both single-document and multi-document extraction. A single-document system extracts answers from just one text.


(Image Credit: HTX)


For multi-document extraction, HTX DSAI expanded the single-document system by leveraging Haystack – an open-source framework for building search systems.
 

(Image Credit: HTX, Haystack)

The Haystack process consists of two main parts.

Firstly, the Indexing Pipeline processes the texts into the correct format for analysis. Then, the Search Pipeline performs entity extraction. Within the Search Pipeline, the Retriever acts as a filter, filtering for relevant articles based on the question asked. The filtered articles are then passed to the Reader, which extracts the answer from them.
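The flow described above can be sketched in plain Python. This is a toy illustration under stated assumptions, not Haystack's API and not HTX's implementation: every function name here is hypothetical, the Retriever is approximated by simple word overlap, and the Reader is replaced by a sentence-overlap stand-in, whereas a real Reader would use a deep learning model to extract the answer span.

```python
import re
from typing import Dict, List


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def index_documents(raw_texts: List[str]) -> List[Dict]:
    """Indexing Pipeline (toy): store each text together with its set of tokens."""
    return [{"text": t.strip(), "tokens": set(tokenize(t))} for t in raw_texts]


def retrieve(question: str, index: List[Dict], top_k: int = 2) -> List[Dict]:
    """Retriever (toy): keep the top_k documents sharing the most words with the question."""
    q_tokens = set(tokenize(question))
    ranked = sorted(index, key=lambda d: len(q_tokens & d["tokens"]), reverse=True)
    return ranked[:top_k]


def read(question: str, documents: List[Dict]) -> str:
    """Reader (stand-in): return the sentence that overlaps most with the question.

    Assumption: a real Reader would be a fine-tuned language model, not word overlap.
    """
    q_tokens = set(tokenize(question))
    sentences = [
        s.strip()
        for d in documents
        for s in re.split(r"(?<=[.!?])\s+", d["text"])
        if s.strip()
    ]
    return max(sentences, key=lambda s: len(q_tokens & set(tokenize(s))))


# Hypothetical documents, for illustration only.
docs = [
    "The suspect was seen near the grass patch beside the MRT station. He carried a frying pan.",
    "Officers attended a training course on Tuesday. The course covered report writing.",
    "A water bottle was recovered from the car park.",
]

index = index_documents(docs)                  # Indexing Pipeline
question = "Where was the suspect seen?"
candidates = retrieve(question, index)         # Search Pipeline: Retriever filters
answer = read(question, candidates)            # Search Pipeline: Reader extracts
print(answer)
```

Splitting the work this way means the expensive Reader only has to examine the few documents the cheaper Retriever lets through, rather than the whole collection.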

The applications of QA for entity extraction within the Home Team are wide-ranging.

For one, officers can use QA systems to quickly find entities of interest (weapons, locations, people) in reports or news articles.

DSAI and Q-Team are developing QA systems that let officers easily find the information they need in lengthy government guidelines and documents, simplifying workflows and increasing efficiency.

The Way Forward

With QA systems, we can easily extract information from text without manual data annotation or traditional keyword searches that often yield incorrect answers.

QA systems are an important foundational building block for developing more complex tools for MHA’s unique operations!
