What is RNA Splicing and why is it needed?
The field of bioinformatics has long been marked by path-breaking technological advances. A subdiscipline at the intersection of biology and computer science, bioinformatics deals with acquiring, storing, analyzing and disseminating biological data, most often DNA and amino acid sequences.
RNA splicing belongs to this discipline. It is a stage of gene expression that is closely tied to transcription, the process by which information from a strand of DNA is copied into a new molecule of messenger RNA (mRNA).
DNA stores genetic material safely and stably in the nuclei of cells, where it serves as a template. The code is carried from DNA to proteins by mRNA, which is built in a couple of stages.
In the initial stage, every gene is transcribed into pre-mRNA. A gene is made up of two kinds of sections: exons (coding sections) and introns (non-coding sections). The pre-mRNA contains both, and splicing then joins the exons together.
RNA splicing matters because it removes the intervening introns (non-coding sequences) from the pre-mRNA and joins the exons (protein-coding sequences), so that the resulting mRNA can be translated into a protein.
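The idea of removing introns and joining exons can be sketched as a simple string operation. The following Ruby snippet is only a toy illustration: the sequence, the exon coordinates and the `splice` helper are all made up for this example and do not come from the client's application or from any real gene.

```ruby
# Toy model of splicing: keep the exons of a pre-mRNA, drop the introns.
# The sequence and exon coordinates below are invented for illustration.

# Join the exon segments of a pre-mRNA string in positional order.
# exon_ranges: array of zero-based, inclusive Ranges marking the exons.
def splice(pre_mrna, exon_ranges)
  exon_ranges.sort_by(&:begin).map { |range| pre_mrna[range] }.join
end

# Hypothetical pre-mRNA: exons in upper case, introns in lower case.
PRE_MRNA = "AUGGCC" + "guaaguuucag" + "UUCGAA" + "guccuag" + "UAA"

# Positions of the three upper-case exon segments in PRE_MRNA.
EXONS = [0..5, 17..22, 30..32]

MATURE_MRNA = splice(PRE_MRNA, EXONS)
puts MATURE_MRNA
```

In a real cell the splice sites are recognized by cellular machinery rather than supplied as coordinates; the snippet only shows the end result, a continuous coding sequence with the introns gone.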
The Use Case
In this context, Indium Software worked on a project for a client in this space. The client is a thriving genetic engineering company that drives research and innovation on RNA splicing errors. To extend their existing focus on RNA research and therapeutics, the client intended to create a solution for RNA splicing errors that leverages predictive analytics.
The client had an existing application to detect, catalog and interpret RNA patterns. However, it suffered from performance issues when generating reports on experiment results, and the client wanted to address the significant time this wasted.
Business Requirement and Implementation Strategy
To provide the ideal solution, we had to create a business case and understand the requirements, which were as follows:
- Integrate R programming with the application, for report generation.
- Generate the report for the experiment within a minimum time frame.
- Deploy the application on the Microsoft Azure cloud platform.
The plan for a successful implementation was to build report generation into the existing Ruby on Rails (RoR) application by integrating an R engine that is triggered internally with dynamic parameters. The reports are generated as HTML in R and then rendered in RoR.
In addition, the application architecture had to be improved to reduce complexity and speed up file processing, so that reports load and generate quickly. To achieve availability and maintainability, the application had to be deployed on the cloud using Docker.
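As a rough sketch of how a Rails application might trigger an R engine with dynamic parameters, the Ruby code below shells out to `Rscript` and reads back the HTML file it produces. The helper names, the argument order and the `key=value` parameter format are assumptions made for illustration, not the client's actual code; the R script would have to parse these arguments itself, for example with `commandArgs(trailingOnly = TRUE)`.

```ruby
require "open3"

# Build the argument list for a hypothetical R report script.
# Passing an array (rather than one shell string) means no shell is involved,
# so user-supplied values are not interpreted as shell syntax.
def rscript_command(script_path, input_path, output_path, params = {})
  extra_args = params.map { |key, value| "#{key}=#{value}" }
  ["Rscript", script_path, input_path, output_path, *extra_args]
end

# Run the R script and return the generated HTML as a string.
# Raises if the R process exits with a non-zero status.
def generate_report(script_path, input_path, output_path, params = {})
  command = rscript_command(script_path, input_path, output_path, params)
  _stdout, stderr, status = Open3.capture3(*command)
  raise "R report generation failed: #{stderr}" unless status.success?

  File.read(output_path)
end
```

A controller action could then hand the returned string to the view layer, for instance with `render html: report.html_safe`, assuming the HTML produced by the R script is trusted.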
Indium’s best minds in data science and application development put their heads together to deploy the following solution to generate reports for the RNA sequence experiments:
- With the new requirements that came to light, we updated the report generation application built on RoR.
- For report generation, we incorporated an R programming engine, triggered by the RoR application with dynamic parameters.
- The R engine read the input .txt file and created an HTML report, which RoR then rendered in the app.
- We then leveraged Ruby on Rails’ built-in capabilities to split multiple tabs into individual rails, reducing the wait time and letting the user view the results sooner.
- While running the report, a .txt file was dynamically generated and saved in the Rails repository. The same file is overwritten whenever the user adds new input, which saves disk space.
- An .RData file was generated on the first run, so that subsequent runs load the reports faster instead of reading through all the files again.
- The Rails application was then deployed on the Microsoft Azure cloud using Docker.
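Two of the steps above, overwriting a single input file and reusing a data file saved on the first run, can be sketched on the Ruby side as follows. This is a minimal illustration under stated assumptions: the helper names are hypothetical, and the freshness rule (reuse the cache only if it is at least as new as every source file) is one plausible policy, not necessarily the one used in the project.

```ruby
require "fileutils"

# Write the user's input to one fixed path, replacing any earlier contents,
# so that repeated runs do not accumulate files on disk.
def write_input(path, contents)
  FileUtils.mkdir_p(File.dirname(path))
  File.write(path, contents)
  path
end

# True when the cache file exists and is at least as new as every source file,
# meaning the slower full read of the sources can be skipped.
# (Assumed policy for this sketch; the original project may differ.)
def cache_fresh?(cache_path, source_paths)
  return false unless File.exist?(cache_path)

  cache_time = File.mtime(cache_path)
  source_paths.all? { |source| File.mtime(source) <= cache_time }
end
```

With helpers like these, the application could ask the R engine to load the saved data file when `cache_fresh?` returns true and fall back to reading every input file otherwise.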
Ensuring Customer Delight
Our implementation did more than complete the project: the client was amazed by the results we achieved. Some highlights of how the client benefited:
- Report generation time was reduced to just 15 seconds (one fifth of what it was) by splitting multiple tabs into individual rails.
- The complexity of the Ruby on Rails code in the application was reduced significantly, improving ease of use and efficiency.
- Generating the .RData file reduced the loading time of the reports by nearly 30%.
- Dockerized Cloud deployment improved the availability and maintainability of the application.