DBpedia Neural Multilingual QA - GSoC project summary

 GSoC blog

  • This blog describes in detail the work I carried out during the GSoC period.
  • It also lists the tasks I completed week by week.

Brief Description of my work

  • Implement airML using KBox and pip.
    • This task was to package KBox as a pip package called airML and distribute it, so that users can share and dereference ML models.
  • Share the Monument dataset with airML.
    • Train a model on the Monument dataset, place it in a public repository, and dereference it with airML.
  • Create the language detector dataset.
    • Iterate over all questions in IRbench to create the dataset.
  • Create a language detector model and train it.
    • After creating the model, I evaluated other existing language models and wrote a research paper with my mentor.
  • Experiment with machine translation methods.
    • Creating datasets.
    • Annotating Datasets.
    • Evaluating different methods.
  • The full description of the tasks carried out and the results obtained is included in this blog.
  • The results obtained for the research paper "Assessing the Efficiency of Language Identification Methods and Frameworks over Linked Data" are included here.
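
The dataset-creation step for the language detector (iterating over every question in the benchmark) can be sketched roughly as follows. This is only an illustration, not the code used in the project: the field names `questions`, `question`, `language` and `string` are assumptions modeled on QALD-style JSON, and the real IRbench layout may differ. The helper names `build_language_dataset` and `to_csv` are hypothetical.

```python
import csv
import io


def build_language_dataset(benchmark):
    """Flatten multilingual questions into unique (text, language) rows.

    Assumes a QALD-style layout (an assumption, not confirmed for IRbench):
    {"questions": [{"question": [{"language": "en", "string": "..."}]}]}
    """
    rows = []
    seen = set()
    for item in benchmark.get("questions", []):
        for variant in item.get("question", []):
            text = (variant.get("string") or "").strip()
            lang = (variant.get("language") or "").strip()
            # Skip entries with no usable text or no language label.
            if not text or not lang:
                continue
            key = (text, lang)
            # Keep each (text, language) pair only once.
            if key in seen:
                continue
            seen.add(key)
            rows.append(key)
    return rows


def to_csv(rows):
    """Serialize the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["text", "language"])
    writer.writerows(rows)
    return buffer.getvalue()


# Tiny made-up sample, for illustration only.
sample = {
    "questions": [
        {"id": "1", "question": [
            {"language": "en", "string": "Who is the mayor of Berlin?"},
            {"language": "de", "string": "Wer ist der Bürgermeister von Berlin?"},
        ]},
        {"id": "2", "question": [
            {"language": "en", "string": "Who is the mayor of Berlin?"},
            {"language": "fr", "string": "  "},
        ]},
    ]
}

rows = build_language_dataset(sample)
csv_text = to_csv(rows)
```

Each row pairs a question text with its language label, which is the shape a language detector needs for training. Duplicate and empty entries are dropped so they do not skew the class balance.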

Code

    Repositories

        Since these GitHub repositories were created for the "DBpedia Neural Multilingual QA" GSoC project, all the work in them during the GSoC period was done by me.

    My Pull Requests    

TODO

  • Try out different datasets, such as the SQuAD datasets.
  • Try annotating the input-language and target-language texts separately using DBpedia Spotlight.
  • Try OpenAI GPT-3 for question text generation.
