Question Answering is a task within the field of natural language processing concerned with building systems that automatically answer questions posed by humans in natural language. The ability to read a passage of text and then answer questions about it is a challenging task for machines, as it requires knowledge about the world. Existing question-answering datasets have two main weaknesses: those with high-quality, human-authored questions are too small for training modern data-hungry models, while those that are large do not share the characteristics of genuine reading-comprehension questions.
To address the need for large, high-quality question-answering datasets, we will discuss some of the popular datasets and how to load them using TensorFlow and PyTorch. Further, we will look at the benchmark models that achieve the best published results on each of these datasets.
SQuAD
The Stanford Question Answering Dataset (SQuAD) is a reading-comprehension dataset consisting of 100,000+ questions posed by crowd workers on a set of Wikipedia articles, where the answer to every question is a segment of text from the corresponding reading passage. The dataset was introduced by Pranav Rajpurkar, Jian Zhang, Konstantin Lopyrev, and Percy Liang of Stanford University. The code below loads SQuAD v2.0, which extends the original dataset with over 50,000 unanswerable questions.
Loading the dataset using PyTorch
import json
import os

from torchnlp.download import download_file_maybe_extract


def squad_dataset(directory='data/',
                  train=False,
                  dev=False,
                  train_filename='train-v2.0.json',
                  dev_filename='dev-v2.0.json',
                  check_files_train=['train-v2.0.json'],
                  check_files_dev=['dev-v2.0.json'],
                  url_train='https://rajpurkar.github.io/SQuAD-explorer/dataset/train-v2.0.json',
                  url_dev='https://rajpurkar.github.io/SQuAD-explorer/dataset/dev-v2.0.json'):
    # Download the splits if they are not already cached locally.
    download_file_maybe_extract(url=url_dev, directory=directory, check_files=check_files_dev)
    download_file_maybe_extract(url=url_train, directory=directory, check_files=check_files_train)

    squad = []
    splits = [(train, train_filename), (dev, dev_filename)]
    splits = [f for (requested, f) in splits if requested]
    for filename in splits:
        full_path = os.path.join(directory, filename)
        with open(full_path, 'r') as temp:
            squad.append(json.load(temp)['data'])

    if len(squad) == 1:
        return squad[0]
    else:
        return tuple(squad)
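Assuming the helper above, the training split can then be loaded and a single question-answer pair inspected. The indexing below follows the SQuAD v2.0 JSON layout (articles contain paragraphs, and each paragraph pairs a context with its questions):

train = squad_dataset(train=True)

# Pick the first question-answer pair of the first article.
paragraph = train[0]['paragraphs'][0]
qa = paragraph['qas'][0]
print(qa['question'])
# Unanswerable SQuAD v2.0 questions have an empty answers list.
if qa['answers']:
    print(qa['answers'][0]['text'])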
Loading the dataset using TensorFlow
import json

import tensorflow as tf


def squad(url, filename='train-v2.0.json'):
    # Download the SQuAD JSON file and cache it locally.
    path = tf.keras.utils.get_file(filename, origin=url)
    with open(path, 'r') as f:
        data = json.load(f)['data']
    # Flatten the nested article -> paragraph -> QA structure
    # into (question, answer) string pairs, skipping unanswerable questions.
    questions, answers = [], []
    for article in data:
        for paragraph in article['paragraphs']:
            for qa in paragraph['qas']:
                if qa['answers']:
                    questions.append(qa['question'])
                    answers.append(qa['answers'][0]['text'])
    return tf.data.Dataset.from_tensor_slices((questions, answers))


train = squad('https://rajpurkar.github.io/SQuAD-explorer/dataset/train-v2.0.json')
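With eager execution enabled (the default in TensorFlow 2), the resulting tf.data.Dataset can be inspected directly; a brief sketch assuming the squad helper above:

# Print a few (question, answer) pairs from the training pipeline.
for question, answer in train.take(3):
    print(question.numpy().decode('utf-8'), '->', answer.numpy().decode('utf-8'))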
State of the Art
The current state of the art on the SQuAD 2.0 leaderboard is SA-Net on Albert, which achieved an F1 score of 93.011.
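SA-Net on Albert is a leaderboard submission rather than a released library, but a publicly available model fine-tuned on SQuAD 2.0 can be tried in a few lines with the Hugging Face transformers package (an extra dependency, used here purely as an illustration; deepset/roberta-base-squad2 is a stand-in checkpoint, not the leaderboard model):

from transformers import pipeline

# A question-answering pipeline backed by a model fine-tuned on SQuAD 2.0.
qa_model = pipeline('question-answering', model='deepset/roberta-base-squad2')

result = qa_model(
    question='Who introduced SQuAD?',
    context='The Stanford Question Answering Dataset (SQuAD) was introduced by '
            'researchers at Stanford University, and every answer is a span of '
            'the corresponding passage.')
print(result['answer'], result['score'])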
bAbI
bAbI is a dataset for question answering and text understanding, introduced by Facebook AI Research. It is composed of a set of contexts (short synthetic stories), with multiple question-answer pairs available for each context, organized into 20 task types. Both English and Hindi versions are provided, and each task comes in a 1,000-example and a 10,000-example (10k) variant.
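To make this concrete, here is what the raw data looks like for task 1 ("single supporting fact"): sentences within a story are numbered, and each question line carries a tab-separated answer plus the index of its supporting sentence (the lines below are illustrative of the file format):

1 Mary moved to the bathroom.
2 John went to the hallway.
3 Where is Mary?	bathroom	1
4 Daniel went back to the hallway.
5 Sandra moved to the garden.
6 Where is Daniel?	hallway	4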
Loading the dataset using PyTorch
import os
from io import open

import torch
# Requires a torchtext release with the legacy data API (torchtext < 0.9).
from torchtext.data import Dataset, Field, Example, Iterator


class BABI20Field(Field):

    def __init__(self, memory_size, **kwargs):
        super(BABI20Field, self).__init__(**kwargs)
        self.memory_size = memory_size
        self.unk_token = None
        self.batch_first = True

    def preprocess(self, x):
        if isinstance(x, list):
            return [super(BABI20Field, self).preprocess(s) for s in x]
        else:
            return super(BABI20Field, self).preprocess(x)

    def pad(self, minibatch):
        if isinstance(minibatch[0][0], list):
            self.fix_length = max(max(len(x) for x in ex) for ex in minibatch)
            padded = []
            for ex in minibatch:
                # Sentences are indexed in reverse order and truncated to memory_size.
                nex = ex[::-1][:self.memory_size]
                padded.append(super(BABI20Field, self).pad(nex)
                              + [[self.pad_token] * self.fix_length]
                              * (self.memory_size - len(nex)))
            self.fix_length = None
            return padded
        else:
            return super(BABI20Field, self).pad(minibatch)

    def numericalize(self, arr, device=None):
        if isinstance(arr[0][0], list):
            tmp = [super(BABI20Field, self).numericalize(x, device=device).data
                   for x in arr]
            arr = torch.stack(tmp)
            if self.sequential:
                arr = arr.contiguous()
            return arr
        else:
            return super(BABI20Field, self).numericalize(arr, device=device)


class BABI20(Dataset):
    urls = ['http://www.thespermwhale.com/jaseweston/babi/tasks_1-20_v1-2.tar.gz']
    name = ''
    dirname = ''

    def __init__(self, path, text_field, only_supporting=False, **kwargs):
        fields = [('story', text_field), ('query', text_field), ('answer', text_field)]
        self.sort_key = lambda x: len(x.query)
        with open(path, 'r', encoding="utf-8") as f:
            triplets = self._parse(f, only_supporting)
        examples = [Example.fromlist(triplet, fields) for triplet in triplets]
        super(BABI20, self).__init__(examples, fields, **kwargs)

    @staticmethod
    def _parse(file, only_supporting):
        data, story = [], []
        for line in file:
            tid, text = line.rstrip('\n').split(' ', 1)
            # Line id 1 marks the start of a new story.
            if tid == '1':
                story = []
            # Sentence lines end with '.'; question lines carry tab-separated fields.
            if text.endswith('.'):
                story.append(text[:-1])
            else:
                query, answer, supporting = (x.strip() for x in text.split('\t'))
                if only_supporting:
                    # Keep only the sentences flagged as supporting facts.
                    substory = [story[int(i) - 1] for i in supporting.split()]
                else:
                    substory = [x for x in story if x]
                data.append((substory, query[:-1], answer))  # remove '?'
                story.append("")
        return data

    @classmethod
    def iters(cls, batch_size=32, root='.data', memory_size=50, task=1, joint=False,
              tenK=False, only_supporting=False, sort=False, shuffle=False,
              device=None, **kwargs):
        text = BABI20Field(memory_size)
        # BABI20.splits (provided by the full torchtext implementation this
        # snippet is adapted from) downloads the archive and builds the
        # train/validation/test datasets for the chosen task.
        train, val, test = BABI20.splits(text, root=root, task=task, joint=joint,
                                         tenK=tenK, only_supporting=only_supporting,
                                         **kwargs)
        text.build_vocab(train)
        return Iterator.splits((train, val, test), batch_size=batch_size, sort=sort,
                               shuffle=shuffle, device=device)
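Assuming the full implementation (older torchtext releases ship this same class as torchtext.datasets.BABI20, including the splits method referenced above), iterators for a single task can then be built like this:

# Build train/validation/test iterators for bAbI task 1 (English 10k variant).
train_iter, val_iter, test_iter = BABI20.iters(batch_size=32, task=1, tenK=True)

batch = next(iter(train_iter))
# story is padded to (batch, memory_size, sentence_length);
# query and answer are (batch, tokens).
print(batch.story.shape, batch.query.shape, batch.answer.shape)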
Loading the dataset using Keras
import tarfile

from keras.utils.data_utils import get_file

# Download the bAbI archive into the Keras cache and open it for reading.
try:
    path_new = get_file('babi-tasks-v1-2.tar.gz',
                        origin='https://s3.amazonaws.com/text-datasets/babi_tasks_1-20_v1-2.tar.gz')
except Exception:
    print('Error downloading dataset, please download it manually:\n'
          '$ wget http://www.thespermwhale.com/jaseweston/babi/tasks_1-20_v1-2.tar.gz\n'
          '$ mv tasks_1-20_v1-2.tar.gz ~/.keras/datasets/babi-tasks-v1-2.tar.gz')
    raise

readfile = tarfile.open(path_new)
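Individual task files can then be read straight from the archive without unpacking it; the member path below follows the directory layout of the official tarball:

# Read the training file for task 1 (English, 1k variant) from the archive.
task_path = 'tasks_1-20_v1-2/en/qa1_single-supporting-fact_train.txt'
with readfile.extractfile(task_path) as f:
    lines = f.read().decode('utf-8').splitlines()
print(len(lines), 'lines; first:', lines[0])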
State of the Art
The current state of the art on the bAbI dataset is STM, which achieved an accuracy of 99.85%.
Natural Questions
Natural Questions contains 307,373 training questions, 7,830 development questions, and 7,842 test questions, together with human-annotated answers drawn from Wikipedia pages, for use in training question-answering systems. The dataset is the first to replicate the end-to-end process by which people find answers to questions: the questions are real, anonymized queries issued to the Google search engine. It was released by Google Research.
Loading the dataset using TensorFlow
import glob

import jsonlines
import tensorflow as tf

# Eager execution is required for TensorFlow 1.x; it is the default in 2.x.
tf.enable_eager_execution()

# Pattern matching a local copy of the Natural Questions training shards.
_train_file_path = '/Users/deniz/natural_questions/data/nq-train-*.jsonl'
train_files = glob.glob(_train_file_path)

examples = []
for train_file in train_files:
    print(train_file)
    with jsonlines.open(train_file) as reader:
        for i, example in enumerate(reader):
            # Drop the raw page HTML to keep the examples lightweight.
            del example['document_html']
            examples.append(example)
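Each line in these files is one JSON record. Assuming the full (HTML) Natural Questions format implied by the document_html field above, the question text and its human annotations can then be inspected like so (field names follow the official release; treat this as a sketch):

# Look at the structure of the first loaded example.
first = examples[0]
print(sorted(first.keys()))
print(first['question_text'])
# Each annotation marks a long-answer span, any short-answer spans,
# and a yes/no flag where applicable.
print(first['annotations'][0])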
State of the Art
The current state of the art on the Natural Questions dataset is GPT-3 175B (few-shot), which achieved an accuracy of 29.9%.
Conclusion
In this article, we have covered some of the high-quality datasets used for question answering and shown how to load each corpus with different Python libraries. These datasets feature a diverse range of question and answer types. From the results above, we can see that the STM model performed exceptionally well on the bAbI dataset, with an accuracy above 99%.