EOC Talk
Saket Choudhary
Guide: Dr. Syed Asad Rahman
Project:
- A) Develop a WebService for Atom-to-Atom Mapping and Reaction Query
- B) Develop prediction machines for enzymes
WebServices
Developed a RESTful WebService for Atom-to-Atom Mapping and Reaction Similarity Search
WebServices : The Challenges
To develop a service that can be called from the command line and also has a web front end (HTML)
WebServices : The Model
The task submitted by the user is executed as a background job, similar to a job submission on Farms
WebServices : The Model
We use threads. Jobs can take as long as 12 hours
WebServices : The Model
- Every submitted job has a unique ID (jobId) associated with it
- We use MySQL to keep track of these
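A minimal sketch of how jobs could be tracked by their unique ID. It uses Python's built-in sqlite3 as a stand-in for MySQL so that it runs without a database server; the table name `jobs`, its columns, the status strings, and the helper names are all hypothetical and are not the schema used by the service.

```python
import sqlite3
import uuid


def open_job_store():
    """Create an in-memory job table (stand-in for the MySQL table; schema is assumed)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE jobs (job_id TEXT PRIMARY KEY, status TEXT NOT NULL)"
    )
    return conn


def register_job(conn):
    """Insert a new job row and return its unique jobId."""
    job_id = uuid.uuid4().hex
    conn.execute(
        "INSERT INTO jobs (job_id, status) VALUES (?, ?)", (job_id, "QUEUED")
    )
    conn.commit()
    return job_id


def set_status(conn, job_id, status):
    """Record a status change for an existing job."""
    conn.execute("UPDATE jobs SET status = ? WHERE job_id = ?", (status, job_id))
    conn.commit()


def get_status(conn, job_id):
    """Return the stored status, or None if the jobId is unknown."""
    row = conn.execute(
        "SELECT status FROM jobs WHERE job_id = ?", (job_id,)
    ).fetchone()
    return row[0] if row else None
```

Keeping the status in a database rather than in memory means a status query can be answered even if the process that accepted the job has restarted.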
WebServices : The Model
- User gets a success/error message on job submission (instantaneously)
- At most 10 jobs can run at a time
- User can check the status of the submitted job
- An email is sent back to the user with the mapped reaction file and/or reaction query results
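The submission model above can be sketched with a bounded thread pool. This is an illustration under assumptions, not the service's code: the class name `JobRunner`, the status strings, and the message texts are hypothetical, and the email notification step is left out.

```python
import threading
import uuid
from concurrent.futures import ThreadPoolExecutor

MAX_CONCURRENT_JOBS = 10  # limit stated on the slide


class JobRunner:
    """Runs submitted jobs in background threads, at most max_workers at a time."""

    def __init__(self, max_workers=MAX_CONCURRENT_JOBS):
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._futures = {}
        self._lock = threading.Lock()

    def submit(self, func, *args):
        """Queue the job and return (jobId, message) without waiting for it to run."""
        job_id = uuid.uuid4().hex
        try:
            future = self._pool.submit(func, *args)
        except RuntimeError as exc:  # raised if the pool has been shut down
            return None, f"error: {exc}"
        with self._lock:
            self._futures[job_id] = future
        return job_id, "success: job submitted"

    def status(self, job_id):
        """Return a snapshot of the job state for the given jobId."""
        with self._lock:
            future = self._futures.get(job_id)
        if future is None:
            return "UNKNOWN"
        if future.done():
            return "FAILED" if future.exception() is not None else "DONE"
        return "RUNNING" if future.running() else "QUEUED"

    def result(self, job_id, timeout=None):
        """Block until the job finishes and return its result (re-raises job errors)."""
        with self._lock:
            future = self._futures[job_id]
        return future.result(timeout=timeout)
```

Jobs submitted beyond the limit wait in the pool's queue, so the caller still gets an immediate response while no more than ten jobs execute concurrently.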
WebServices: The Model
Prediction of Enzymes (Work in Progress)
Clustering Enzymes
- Wrote an ARFF generator to produce Attribute-Relation File Format (WEKA format) files from the database tables
- This module merges data from multiple files into an Attribute-Class relation for enzymes
- Built a pipeline to convert the data from the database to multiple formats (CSV, tab-delimited), and further an R-based pipeline to cluster these
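For readers unfamiliar with ARFF, the sketch below shows the general shape of such a file: a relation name, one declaration per attribute, a nominal class attribute, and the data rows. The function name `to_arff` and its parameters are hypothetical, and it assumes numeric attributes whose names contain no spaces; it is not the generator described above.

```python
def to_arff(relation, attribute_names, rows, class_labels):
    """Render rows of numeric attributes plus a trailing class label as ARFF text."""
    lines = [f"@relation {relation}", ""]
    for name in attribute_names:
        lines.append(f"@attribute {name} numeric")
    # The class is declared as a nominal attribute listing every allowed label.
    lines.append("@attribute class {" + ",".join(class_labels) + "}")
    lines += ["", "@data"]
    for row in rows:
        lines.append(",".join(str(value) for value in row))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Made-up attribute names and values, for illustration only.
    print(to_arff("enzymes", ["f1", "f2"], [(3, 1, "A"), (2, 0, "B")], ["A", "B"]))
```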
Clustering Enzymes
- Ran clustering algorithms on the EC 6 class of enzymes
- The clustering was performed using multiple hierarchical clustering methods and multiple distance measures
Clustering Enzymes
- To decide on the best clustering method, we make use of the Purity index
Purity = (sum over all clusters of the count of the most frequent class in each cluster) / (total number of elements)
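The purity formula can be sketched in a few lines. The function name `purity` and the toy labels are hypothetical; the inputs are assumed to be two parallel lists giving, for each element, its assigned cluster and its known class.

```python
from collections import Counter, defaultdict


def purity(cluster_ids, class_labels):
    """Fraction of elements that belong to the majority class of their cluster."""
    if len(cluster_ids) != len(class_labels):
        raise ValueError("cluster_ids and class_labels must have the same length")
    if not cluster_ids:
        raise ValueError("at least one element is required")
    # Count how often each class occurs inside each cluster.
    members = defaultdict(Counter)
    for cluster, label in zip(cluster_ids, class_labels):
        members[cluster][label] += 1
    # Sum the size of the largest class in every cluster.
    majority_total = sum(max(counts.values()) for counts in members.values())
    return majority_total / len(cluster_ids)


if __name__ == "__main__":
    clusters = [0, 0, 0, 1, 1, 1]
    classes = ["a", "a", "b", "b", "b", "a"]
    print(purity(clusters, classes))  # 4 of 6 elements are in their cluster's majority class
```

Purity tends to rise as the number of clusters grows and reaches 1 when every element is its own cluster, so it is usually read together with the cluster count.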
Clustering Enzymes
Hclust | Distance | Value | Purity |
Median | Manhattan | 1 | 9 |
Complete | Euclidean | 0.992 | 9 |
Complete | Minkowski | 0.991 | 14 |
Average | Euclidean | 0.944 | 8 |
Ward | Canberra | 0.976 | 10 |
Clustering Enzymes
- Clustering was performed using "pvclust", an R package for agglomerative (hierarchical) clustering
- We pruned the resulting dataset using the threshold p-value >= 0.95
Clustering Enzymes:
These methods provide good clustering results based on the Purity value;
however, there are other criteria for measuring the quality of clusters that have not yet been considered (Normalized Mutual Information, Rand Index)
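For reference, the two criteria named above can be sketched as follows. The function names are hypothetical, and the NMI variant shown normalizes by the arithmetic mean of the two entropies, which is one of several conventions in use; returning 1.0 when both labelings consist of a single group is also a convention chosen here.

```python
import math
from collections import Counter
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of element pairs on which the two labelings agree (same vs. different group)."""
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must have the same length")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        raise ValueError("at least two elements are required")
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)


def _entropy(labels):
    """Shannon entropy (natural log) of a labeling."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def normalized_mutual_information(labels_a, labels_b):
    """Mutual information divided by the mean entropy of the two labelings."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("labelings must be non-empty and of equal length")
    n = len(labels_a)
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    joint = Counter(zip(labels_a, labels_b))
    mutual_info = sum(
        (n_ab / n) * math.log(n * n_ab / (count_a[a] * count_b[b]))
        for (a, b), n_ab in joint.items()
    )
    mean_entropy = (_entropy(labels_a) + _entropy(labels_b)) / 2
    return 1.0 if mean_entropy == 0 else mutual_info / mean_entropy
```

Unlike purity, both measures penalize splitting a class across many small clusters, which is why they are often reported alongside it.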
To Do
- Documentation for WebServices
- Clustering of all ECs