Week from 05/08 to 05/15

Start of the week

The task for this week was to implement a Python application for trained machine learning models. I used KBox (https://github.com/AKSW/KBox) as the base and created an interface between KBox and airML (https://github.com/AKSW/airML) that includes functions useful for my GSoC project.

Why?

This is very useful for research reproducibility: it allows other users to reuse models without having to retrain them. So far, I have trained a model on the monument_300 dataset with NSpM.

How?

Instead of reimplementing everything from KBox, I implemented only the functions needed to share, locate and install trained models:


list(kns=False)
Description: List all available models (kns=False) or all KNS services (kns=True).
Args:
    kns: 'boolean', defines whether to list only the KNS services or not.
Returns:
    None
Throws:
    OSError

install(modelID, format=None, version=None)
Description: Install a model given its modelID.
Args:
    modelID: 'string', URL of the model hosted in a public repository.
    format: 'string', format of the model.
    version: 'string', specific version of the model to be installed.
Returns:
    None
Throws:
    OSError
Example:
    install("http://github.org/aksw/NSpM/monument_300","NSPM/Model","0")

install(modelID, format=None, version=None, kns=None)
Description: Install a given model using the available KNS services to resolve it.
Args:
    modelID: 'string', URL of the model to be installed.
    format: 'string', format of the model.
    version: 'string', version of the model.
    kns: 'string', URL of the KNS service.
Returns:
    None
Throws:
    OSError

removeKNS(kns)
Description: Remove a given KNS service.
Args:
    kns: 'string', URL of the KNS service.
Returns:
    None
Throws:
    OSError

getInfo(model, format=None, version=None)
Description: Give information about a specific model.
Args:
    model: URL of the model.
    format: format of the model.
    version: version of the model.
Returns:
    None
Throws:
    OSError

locate(modelID, format=None, version=None)
Description: Find the local address of the given model.
Args:
    modelID: 'string', URL of the model to be located.
    format: 'string', format of the model.
    version: 'string', version of the model.
Returns:
    None
Throws:
    OSError

search(pattern, format=None, version=None)
Description: Search for all model IDs containing a given pattern.
Args:
    pattern: 'string', pattern to match against the model URLs.
    format: 'string', format of the model.
    version: 'string', version of the model.
Returns:
    None
Throws:
    OSError

getModelDirPath()
Description: Show the path to the folder that contains the models.
Returns:
    Path of the installed models.
Throws:
    OSError

setModelDirPath(dir)
Description: Change the path of the resource folder.
Args:
    dir: 'string', new model path.
Returns:
    None
Throws:
    OSError

showVersion()
Description: Return the KBox version.
Returns:
    KBox version.
Throws:
    OSError
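
To illustrate how a wrapper with these signatures might hand its optional arguments over to the KBox command line, here is a minimal sketch. The helpers `build_kbox_args` and `install_args`, as well as the flag names (`-install`, `-kb`, `-format`, `-version`), are assumptions made for illustration; they are not the actual airML code, and the real KBox flags should be checked against its documentation.

```python
from typing import List, Optional


def build_kbox_args(command: str, **options: Optional[str]) -> List[str]:
    """Turn Python keyword arguments into a flat list of CLI arguments.

    Options whose value is None are skipped, so optional parameters such as
    format=None or version=None simply do not appear on the command line.
    """
    args = [f"-{command}"]
    for name, value in options.items():
        if value is not None:
            args.extend([f"-{name}", str(value)])
    return args


def install_args(modelID: str, format: Optional[str] = None,
                 version: Optional[str] = None) -> List[str]:
    # Hypothetical flag names, chosen only to show the mapping.
    return build_kbox_args("install", kb=modelID, format=format, version=version)


print(install_args("http://github.org/aksw/NSpM/monument_300", "NSPM/Model", "0"))
```

Keeping the argument building in a pure function like this makes it easy to check the generated command without starting a Java process.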


After implementing the above functions, I trained a model on the monument_300 dataset with NSpM. I then serialized the trained model and shared it on figshare (http://figshare.com/). The model is available at https://figshare.com/articles/monument_300_model_zip/12303242.
Finally, I installed the trained model using airML.

Problems

During the activity, I faced some challenges.  
  • How to port arguments from Java to Python?
  • How to extract stdout from KBox jar?
  • How to distribute a Python package?

How did I solve the problems?

  • I solved the first problem through explicit system calls.
  • I solved the second by using the subprocess module in Python, which allows you to spawn new processes, connect to their input/output/error pipes, and obtain their return codes.
  • I used the setuptools library to create and distribute the package.
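
The subprocess approach mentioned above can be sketched roughly as follows. This is a minimal sketch and not the airML source: the helper names `run_and_capture` and `kbox_command` are hypothetical, and raising OSError on a non-zero exit code is an assumption chosen to match the "Throws" entries in the function list.

```python
import subprocess
import sys
from typing import List


def run_and_capture(command: List[str]) -> str:
    """Run a command, wait for it to finish, and return its stdout as text."""
    result = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
    )
    if result.returncode != 0:
        # Assumption: report failures of the child process as OSError.
        raise OSError(f"command failed ({result.returncode}): {result.stderr.strip()}")
    return result.stdout


def kbox_command(jar_path: str, args: List[str]) -> List[str]:
    # "java -jar <jar> <args...>" is the usual way to launch an executable jar.
    return ["java", "-jar", jar_path, *args]


# Demonstration with the Python interpreter standing in for the jar,
# so the sketch runs even where Java or KBox is not installed.
print(run_and_capture([sys.executable, "-c", "print('hello from child')"]))
```

If the `java` executable is missing, `subprocess.run` raises FileNotFoundError, which is itself a subclass of OSError, so callers can handle both cases with a single `except OSError`.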
