.. MediCurator documentation master file, created by
   sphinx-quickstart on Thursday, August 6.
   You can adapt this file completely to your liking, but it should at least
   contain the root `toctree` directive.

******************************************************************************
MediCurator : Near Duplicate Detection for Medical Data Warehouse Construction
******************************************************************************

Welcome to the MediCurator Documentation.
Here you will find information describing the features of the MediCurator platform,
tips on how to use it, and details about its RESTful API.

.. image:: deployment.png
   :scale: 100
   :align: center

With the growing adaptation of pervasive computing into medical domain and
increasingly open access to data, near duplicate detection algorithms have been
proposed and implemented in order to detect and eliminate duplicate entries from
massive datasets.

Traditionally, near duplicate detection algorithms are sequential and operate on
a single computer. Now, InMemory Data Grids (IMDG) offer a distributed storage
and execution, giving the illusion of a single large computer over multiple
computing nodes in a cluster. However, common distribution strategy and framework
to parallelize the execution of the near duplicate detection algorithms is still
lacking.

MediCurator is a near duplicate detection framework for heterogeneous medical data
sources in constructing data warehouses. MediCurator has been developed to retrieve
medical data from various data sources, including MySQL, MongoDB, CSV files, and
medical image archives such as TCIA, and detect the duplicates in-memory, while
storing the merged data into data warehouses hosted in Hadoop Distributed File
System (HDFS).

This documentation is intended to serve both the MediCurator developers/deployers
as well as the MediCurator users.

Getting Started With MediCurator
################################

You may download and build MediCurator from its source code, which is readily avilable at

https://bitbucket.org/BMI/medicurator

The source code of this documentation can be found at,
https://github.com/Ireneruru/MediCurator-Readthedocs/tree/master/docs


This documentation is currently hosted at,
http://medicurator-readthedocs.readthedocs.io/


MediCurator Research
####################

.. toctree::
   :maxdepth: 3

   sections/Usecase
   sections/About-MediCurator


MediCurator Installation
########################

.. toctree::
   :maxdepth: 3

   sections/Installation


MediCurator for Users
##################

You can easily use MediCurator, because it is very user-friendly. It offers two main ways for you to choose, Rest API and Web-Application.

.. toctree::
   :maxdepth: 4

   sections/MediCurator-REST-API
   sections/Web-Application


MediCurator for Developers
#######################
MediCurator version 1.0 has extensively been developed for some specific environments while maintaining relevant interfaces for extension to the other conditions.

.. toctree::
   :maxdepth: 3

   sections/Image formats
   sections/Data-Sources

Citing MediCurator
##################

If you have used MediCurator in your research, please cite the below papers:

[1]Kathiravelu, P. & Sharma, A. (2016). Near Duplicate Detection for Medical Data Warehouse Construction.
In AMIA 2016 Joint Summits on Translational Science. March 2016.

[2] Kathiravelu, P. & Sharma, A. (2015). MEDIator: A Data Sharing Synchronization Platform for Heterogeneous Medical Image Archives. In Workshop on Connected Health at Big Data Era (BigCHat‘15), co-located with 21st ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2015). Aug. 2015. ACM. 6 pages. http://doi.org/10.13140/RG.2.1.3709.4248

[3] Kathiravelu, P. & Sharma, A. (2016). SPREAD - System for Sharing and Publishing Research Data. In Society for Imaging Informatics in Medicine Annual Meeting (SIIM 2016). June 2016. http://c.ymcdn.com/sites/siim.org/resource/resmgr/siim2016abstracts/Research_Kathiravelu.pdf