AI Summaries, In a Nutshell

by Ong Pang Wei - Data Science and Artificial Intelligence CoE

In our efforts to advance the Home Team’s Science and Technology, it is crucial that we keep up with the latest developments so that we can make well-informed decisions.

However, in this era of Big Data, huge volumes of noise and data are being generated every second. How can time-pressed officers cut through the chaos and make sense of it all?

Well, with Automatic Text Summarisation (ATS), we could train computers to sift through the information and summarise its key points with the push of a button!

ATS is part of Natural Language Processing (NLP), a field in Artificial Intelligence (AI) that seeks to help computers understand human language. We interact with NLP all the time - when our phones autocorrect our typos, when we use a translation app, or even when we ask Google a question.

ATS is a higher-level application of NLP, where a computer “reads” a large body of text and condenses it into its key messages. Automating summarisation can reduce the time spent on research and allow for more consistent filtering of key information, and the resulting summaries can even rival human-written ones in quality.

But before a computer can summarise a text well, it first needs to know what it means.

Distilling Meaning

HTX’s Data Science and AI (DSAI) Centre of Expertise leverages Google’s open-source, neural network-based Bidirectional Encoder Representations from Transformers (BERT) [1] model.

Trained on a massive amount of data, BERT can “understand” language so well that it can determine the importance of words in their context. For example, it can tell from the rest of the sentence whether the word ‘bat’ means the animal or the sports equipment.
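To make the ‘bat’ example concrete, here is a deliberately tiny sketch of context-based disambiguation. It is not BERT: it uses a simplified Lesk-style overlap heuristic, and the cue-word lists and the function name `guess_sense` are hypothetical, written by hand purely for illustration. BERT learns far richer associations automatically from its training data, but the underlying idea is the same: the surrounding words decide the meaning.

```python
import re

# Hypothetical, hand-written cue words for each sense of "bat".
# (An assumption for illustration; BERT learns such associations from data.)
SENSE_CUES = {
    "animal": {"cave", "wings", "flew", "night", "nocturnal", "fruit"},
    "sports equipment": {"ball", "swung", "cricket", "baseball", "hit", "player"},
}


def guess_sense(sentence):
    """Return the sense whose cue words overlap most with the sentence.

    Returns None when no cue word appears, i.e. the context gives no hint.
    """
    words = set(re.findall(r"[a-z]+", sentence.lower()))
    overlaps = {sense: len(words & cues) for sense, cues in SENSE_CUES.items()}
    best = max(overlaps, key=overlaps.get)
    return best if overlaps[best] > 0 else None


print(guess_sense("The bat flew out of the cave at night."))
print(guess_sense("The player swung the bat and hit the ball."))
```

A fixed word list breaks down quickly on real text, which is why a model that learns context from large amounts of data is needed in practice.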

Using BERT’s methodologies as a base, DSAI is building better ATS models that can pick out the core messages from a piece of text to produce higher quality summaries.

Above and Beyond

Currently, most ATS programs are only able to generate one summary from one article. However, to make sense of a topic given the deluge of information in our big data era, we need a way to synthesise one topical summary from many sources.

Hence, DSAI explored the generation of one summary from multiple articles through the innovative use of network analysis techniques. 

This technique first combines all the information from multiple sources into one document. Each sentence is then analysed, and the important information is extracted to construct a final document for summarisation. (Image credit: HTX)
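The article does not say which network analysis technique DSAI uses, so the sketch below substitutes a well-known one as an assumption: a TextRank-style approach, where sentences are nodes in a graph, edges are weighted by word overlap, and a PageRank-style score picks out the most central sentences. All function names and parameters here are hypothetical, and plain word counts stand in for the BERT-based representations a real system would likely use.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def bag_of_words(sentence):
    """Lower-cased word counts for one sentence."""
    return Counter(re.findall(r"[a-z']+", sentence.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def summarise(documents, num_sentences=2, damping=0.85, iterations=50):
    """Pick the most central sentences across several documents."""
    # Step 1: pool the sentences from every source into one list.
    sentences = [s for doc in documents for s in split_sentences(doc)]
    n = len(sentences)
    if n == 0:
        return []
    vectors = [bag_of_words(s) for s in sentences]

    # Step 2: build a weighted graph; edge weight = similarity of two sentences.
    weights = [
        [cosine(vectors[i], vectors[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    out_sums = [sum(row) for row in weights]

    # Step 3: PageRank-style power iteration over the sentence graph.
    scores = [1.0 / n] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) / n
            + damping
            * sum(
                scores[j] * weights[j][i] / out_sums[j]
                for j in range(n)
                if out_sums[j] > 0
            )
            for i in range(n)
        ]

    # Step 4: keep the top-scoring sentences, listed in their original order.
    top = sorted(sorted(range(n), key=lambda i: scores[i], reverse=True)[:num_sentences])
    return [sentences[i] for i in top]


docs = [
    "The new scanner speeds up cargo checks at the border. "
    "The weather was sunny on Tuesday.",
    "Officers say the new scanner speeds up cargo checks. A cafe opened nearby.",
]
for sentence in summarise(docs, num_sentences=2):
    print(sentence)
```

Sentences that echo what several sources say end up strongly connected and score highest, while one-off remarks sit at the edge of the graph and drop out. This sketch is purely extractive; it does not remove near-duplicate sentences or rewrite them into new text.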

These summaries are useful in many domains, one of which is condensing news for rapid sensemaking. Today, news summarisation within the Home Team is a largely manual process done by officers. If we can automate this process with ATS, we can complement their efforts by processing a greater volume of articles in a fraction of the time.

In the near future, Home Team officers can use HTX’s ATS to automatically distil key information from various sources, allowing for rapid sensemaking that supports decision-making to protect Singapore’s safety and security.

1. Devlin, J., Chang, M.-W., Lee, K., & Toutanova, K. (2018). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. arXiv:1810.04805v2.