DBpedia Neural Multilingual QA - GSoC project summary

August 29, 2020

GSoC blog

This blog gives a detailed description of the work I carried out during the GSoC period, organized by week.

Brief Description of my work

Implement airML using KBox and pip.

This task was to package KBox as a pip package called airML and distribute it, allowing users to share and dereference ML models.

Share the Monument dataset with airML.

Train a model on the Monument dataset, place it in a public repository, and dereference it with airML.

Create the language detector dataset.

Iterate over all questions in IRbench to create the dataset.
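As a rough illustration of this step, here is a minimal sketch of flattening multilingual questions into labeled rows for a language detector. The input shape (one question stored under several language codes), the sample entries, and the helper names `build_rows` and `to_csv` are assumptions made for the example. They are not the actual IRbench format or the code used in the project.

```python
import csv
import io

# Hypothetical input shape: each benchmark entry holds one question
# written in several languages, keyed by a language code.
SAMPLE_ENTRIES = [
    {"id": 1, "question": {"en": "Who designed the Eiffel Tower?",
                           "de": "Wer hat den Eiffelturm entworfen?"}},
    {"id": 2, "question": {"en": "Where is the Colosseum?",
                           "it": "Dove si trova il Colosseo?",
                           "fr": "  "}},  # blank text, should be skipped
]


def build_rows(entries):
    """Flatten entries into (text, language) pairs, skipping blank texts."""
    rows = []
    for entry in entries:
        for lang, text in entry["question"].items():
            text = text.strip()
            if text:
                rows.append((text, lang))
    return rows


def to_csv(rows):
    """Serialize the labeled rows as CSV with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["text", "language"])
    writer.writerows(rows)
    return buffer.getvalue()


rows = build_rows(SAMPLE_ENTRIES)
print(to_csv(rows))
```

Each question text becomes one training example labeled with its language code, which is the shape most text classifiers expect.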

Create a language detector model and train it.

After creating the model, I evaluated other existing language models and wrote a research paper with my mentor.
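To make the comparison concrete, here is a small sketch of how several language detectors could be scored on the same labeled questions. The `evaluate` helper, the sample data, and `toy_detector` are hypothetical stand-ins. A real run would plug in the trained model and the existing tools in place of the toy function, and the metrics reported in the paper may differ from plain accuracy.

```python
from collections import defaultdict


def evaluate(detector, samples):
    """Return overall accuracy and per-language accuracy for a detector.

    `detector` is any callable mapping a text to a language code;
    `samples` is a list of (text, gold_language) pairs.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for text, gold in samples:
        total[gold] += 1
        if detector(text) == gold:
            correct[gold] += 1
    per_language = {lang: correct[lang] / total[lang] for lang in total}
    overall = sum(correct.values()) / sum(total.values())
    return overall, per_language


def toy_detector(text):
    # Placeholder rule for the example only, not a real language detector.
    return "de" if text.lower().startswith("wer") else "en"


SAMPLES = [
    ("Who designed the Eiffel Tower?", "en"),
    ("Wer hat den Eiffelturm entworfen?", "de"),
    ("Dove si trova il Colosseo?", "it"),
    ("Where is the Colosseum?", "en"),
]

overall, per_language = evaluate(toy_detector, SAMPLES)
print(overall, per_language)
```

Reporting accuracy per language as well as overall helps expose detectors that do well on English but fail on less common languages.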

Experimentation in Machine Translation methods.

Creating datasets.
Annotating datasets.
Evaluating different methods.

The full description of the tasks carried out and the results obtained is included in this blog.
The results for the research paper "Assessing the Efficiency of Language Identification Methods and Frameworks over Linked Data" are included here.

Code

Repositories

These GitHub repositories were created for the "DBpedia Neural Multilingual QA" GSoC project, so all of the work in them during the GSoC period was done by me.

airML (https://github.com/AKSW/airML)
LangTagger (https://github.com/AKSW/LangTagger)
QATranslator (https://github.com/dbpedia/QATranslator)

My Pull Requests

Created interfaces for KBox. (https://github.com/AKSW/airML/pull/1)
Implementing install methods in airML. (https://github.com/AKSW/airML/pull/3)
Test function in airML. (https://github.com/AKSW/airML/pull/4)
Evaluated the created language detector and existing language detector tools. (https://github.com/AKSW/LangTagger/pull/1)
Completed the README with the full evaluation results. (https://github.com/AKSW/LangTagger/pull/2)
Experimentation done in the machine translation task. (https://github.com/dbpedia/QATranslator)

TODO

Try out different datasets, such as the SQuAD datasets.
Try annotating the input language and target language texts separately using DBpedia Spotlight.
Try OpenAI GPT-3 for question text generation.
