DBpedia Neural Multilingual QA - GSoC project summary
GSoC blog
- This blog includes a detailed description of the work I carried out during the GSoC period.
- The blog also breaks the tasks down on a weekly basis.
Brief Description of my work
- Implement airML using KBox and pip.
- This task was to package and distribute KBox as a pip package called airML, which allows users to share and dereference ML models.
- Share the Monument dataset with airML.
- Train a model on the Monument dataset, place it in a public repository, and dereference it with airML.
- Create the language detector dataset.
- Iterate over all questions in IRbench to create the dataset.
- Create a language detector model and train it.
- After creating the model, I evaluated other existing language models and wrote a research paper with my mentor.
- Experimentation with machine translation methods.
- Creating datasets.
- Annotating datasets.
- Evaluating different methods.
- The full description of the tasks carried out and the results obtained is included in this blog.
- The results obtained for the research paper "Assessing the Efficiency of Language Identification Methods and Frameworks over Linked Data" are included here.
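As a rough illustration of the language detector dataset and evaluation steps listed above, the sketch below flattens multilingual question entries into (text, language) pairs and scores a detector by accuracy. The entry layout (a "question" list with "language" and "string" fields), the sample questions, and the names build_language_dataset, accuracy, and always_english are all assumptions made for this example; they are not taken from IRbench or from the LangTagger code.

```python
from collections import Counter

# Hypothetical benchmark entries: each question carries one string per language.
# This layout is an assumption for illustration, not the actual IRbench schema.
BENCHMARK = [
    {"id": 1, "question": [
        {"language": "en", "string": "Who is the mayor of Berlin?"},
        {"language": "de", "string": "Wer ist der Bürgermeister von Berlin?"},
    ]},
    {"id": 2, "question": [
        {"language": "en", "string": "Where is the Eiffel Tower?"},
        {"language": "fr", "string": "Où se trouve la tour Eiffel ?"},
        {"language": "de", "string": ""},  # missing translation, skipped below
    ]},
]


def build_language_dataset(entries):
    """Flatten entries into (text, language) pairs, skipping empty values."""
    pairs = []
    for entry in entries:
        for variant in entry.get("question", []):
            text = variant.get("string", "").strip()
            language = variant.get("language", "").strip()
            if text and language:
                pairs.append((text, language))
    return pairs


def accuracy(detector, pairs):
    """Fraction of pairs for which detector(text) equals the gold language tag."""
    if not pairs:
        return 0.0
    correct = sum(1 for text, language in pairs if detector(text) == language)
    return correct / len(pairs)


def always_english(text):
    """Trivial stand-in detector; a real model or library would replace this."""
    return "en"


dataset = build_language_dataset(BENCHMARK)
label_counts = Counter(language for _, language in dataset)
baseline_accuracy = accuracy(always_english, dataset)
print(label_counts, baseline_accuracy)
```

Any callable that maps a question string to a language tag can be passed to accuracy in place of always_english, which makes it easy to compare a trained model against existing tools on the same pairs.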
Code
Repositories
These GitHub repositories were created for the "DBpedia Neural Multilingual QA" GSoC project, so all of the work done in them during the GSoC period is my own.
- airML (https://github.com/AKSW/airML)
- LangTagger (https://github.com/AKSW/LangTagger)
- QATranslator (https://github.com/dbpedia/QATranslator)
My Pull Requests
- Created interfaces for KBox. (https://github.com/AKSW/airML/pull/1)
- Implemented install methods in airML. (https://github.com/AKSW/airML/pull/3)
- Test function in airML. (https://github.com/AKSW/airML/pull/4)
- The created language detector and existing language detector tools were evaluated. (https://github.com/AKSW/LangTagger/pull/1)
- Completed the README with the full evaluation results. (https://github.com/AKSW/LangTagger/pull/2)
- Experimentation done in the machine translation task. (https://github.com/dbpedia/QATranslator)
TODO
- Try out different datasets, such as the SQuAD datasets.
- Try annotating the input language and target language texts separately using DBpedia Spotlight.
- Try OpenAI GPT-3 for question text generation.