Our Mission

Key Success Stories

Key decision science projects by our founders, converting raw data into actionable insights and product solutions.

Aircraft Chit-Chat & Accidents – Probabilistic Topic Models

Developed a system to enhance the effectiveness of proposed policies for preventing aircraft accidents.

Insurance Dynamic Price Discovery Model

An insurance firm was having difficulty developing a market-linked, competitive pricing system for determining the premium for director’s liability.

Self-learning predictive regression models were implemented, controlling for environmental variables and removing seasonality. The models were fed a real-time data stream, giving the insurance firm a framework for pricing an insurance package on any future request.

An R-based price discovery framework that predicts the premium for director's liability.

R, Hadoop, MongoDB, Logistic Regression, Volatility Modeling, Six Sigma, AIC/BIC Models.

US Mortgage Home Loan Portfolio Securitization

Mortgage trustees needed a smart decision-making solution for their mortgage loan portfolio, based on the probability of default.

An open-source smart decision-making platform was developed. First, the loan-level data was organized into a structured repository for a simplified view of the mortgage home loan portfolio, allowing effective surveillance and supporting performance analytics. Second, a proprietary, self-learning predictive financial model was developed to forecast the default rate of loans based on prior data.
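As a rough illustration of forecasting default from prior loan data, the sketch below fits a small logistic regression by gradient descent. It is not the proprietary model described above: the two features (loan-to-value and debt-to-income ratios), the toy data, and all function names are assumptions made only for this example.

```python
import math


def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights (bias first) by batch gradient descent on the log-loss."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            p = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))
            err = p - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad)]
    return w


def default_probability(w, x):
    """Estimated probability that a loan with features x defaults."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))


# Hypothetical toy data: [loan-to-value, debt-to-income] -> 1 if the loan defaulted.
loans = [[0.60, 0.20], [0.70, 0.25], [0.80, 0.30],
         [0.90, 0.45], [0.95, 0.50], [1.00, 0.55]]
defaulted = [0, 0, 0, 1, 1, 1]

weights = fit_logistic(loans, defaulted)
low_risk = default_probability(weights, [0.65, 0.22])
high_risk = default_probability(weights, [0.97, 0.52])
print(f"low-risk loan: {low_risk:.2f}, high-risk loan: {high_risk:.2f}")
```

A portfolio-level default rate could then be estimated by averaging these per-loan probabilities, though the real platform may aggregate differently.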

A predictive decision-making engine that forecasts the default rate of loans in a loan portfolio.

Python, R, Excel, Statistics, Logistic Regression, Volatility Modeling, AIC/BIC Models.

Reliability Score Framework

Development of a unique reliability score for CXO positions at listed companies, based on earnings conference call data.

Developed a novel framework to calculate reliability scores based on the achievement of projections made during earnings conference calls. The model was built on an analysis of 10 years of earnings conference call transcripts, in which actual performance was compared with projections to derive the reliability scores.
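One simple way to turn projection-versus-actual comparisons into a score is the share of projections that were met. The sketch below assumes that definition, a 5% tolerance, and positive-valued figures. These choices, the function name, and the sample data are illustrative only and are not taken from the framework itself.

```python
def reliability_score(calls, tolerance=0.05):
    """Share of projections that were met.

    `calls` is a list of (projected, actual) pairs with positive values.
    A projection counts as met when the actual figure falls short of it
    by no more than `tolerance` (relative).
    """
    if not calls:
        raise ValueError("need at least one projection")
    met = sum(1 for projected, actual in calls
              if actual >= projected * (1.0 - tolerance))
    return met / len(calls)


# Hypothetical (projected, actual) revenue figures from past earnings calls.
history = [(100.0, 104.0), (120.0, 115.0), (150.0, 130.0), (90.0, 90.0)]
score = reliability_score(history)
print(score)  # 0.75: three of the four projections were met within 5%
```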

A unique reliability score model that assigns reliability scores to CXO positions based on earnings conference call transcripts.

Python, Semantic UI, JavaScript, Statistics.

Optimized Portfolio Creation and Analysis

Needed a RESTful API platform that selects stocks from stock databases based on user requirements and builds a “Smart Beta” portfolio from these stock datasets. The requirement was a relative performance measure against traditional market-cap-weighted benchmarks, managed through tracking error control.

Developed an R-based library that builds custom datasets from the stock universe based on user inputs. Using these datasets, the tool builds a portfolio of stocks according to the user-selected strategy and weighting scheme. The library also tracks portfolio performance on live-streamed data by generating a weekly performance report.
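To make the weighting scheme and tracking error concrete, here is a minimal Python sketch (the library itself is R-based). It compares an equal-weighted portfolio with a cap-weighted benchmark and measures tracking error as the standard deviation of the return differences. The tickers, market caps, weekly returns, and function names are all hypothetical.

```python
import statistics


def equal_weights(tickers):
    """Give every stock the same weight."""
    w = 1.0 / len(tickers)
    return {t: w for t in tickers}


def cap_weights(market_caps):
    """Weight each stock by its share of total market capitalization."""
    total = sum(market_caps.values())
    return {t: cap / total for t, cap in market_caps.items()}


def portfolio_returns(weights, returns_by_ticker):
    """Weighted sum of per-stock returns for each period."""
    n_periods = len(next(iter(returns_by_ticker.values())))
    return [sum(weights[t] * returns_by_ticker[t][i] for t in weights)
            for i in range(n_periods)]


def tracking_error(portfolio, benchmark):
    """Sample standard deviation of active (portfolio minus benchmark) returns."""
    active = [p - b for p, b in zip(portfolio, benchmark)]
    return statistics.stdev(active)


# Hypothetical universe: market caps and four weeks of returns.
caps = {"AAA": 600.0, "BBB": 300.0, "CCC": 100.0}
weekly = {
    "AAA": [0.010, -0.020, 0.015, 0.000],
    "BBB": [0.020, -0.010, 0.005, 0.010],
    "CCC": [0.030, 0.010, -0.020, 0.040],
}

benchmark = portfolio_returns(cap_weights(caps), weekly)
smart_beta = portfolio_returns(equal_weights(list(caps)), weekly)
te = tracking_error(smart_beta, benchmark)
print(f"weekly tracking error: {te:.4%}")
```

Tracking error control would then mean constraining the chosen weights so that this figure stays under a user-defined limit.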

A dynamic R-based framework for selecting a Smart Beta portfolio.

Python, R, Excel, Statistics.

Corporate Social Responsibility (CSR) Ratings Model

Development of a rating algorithm for CSR initiatives.

Conceptualized an algorithm to statistically derive Corporate Social Responsibility ratings for Fortune 500 companies by performing textual analysis on the annual reports of 1000+ companies.

A CSR rating algorithm.

Python, Time Series Analysis, NLTK, TextBlob.

Smartphone Social Media Listening Analysis

A giant retailer wanted to find out the impact of new features, launched over social media, on smartphone popularity. A detailed analysis was required to correlate the social media sentiment for each brand with its sales across the globe.

Captured social media conversations for 15+ smartphone brands, performed sentiment analysis, and extracted top themes and other influential insights.

A detailed report providing insights on the correlation between each brand's social media sentiment and its sales across the globe.
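A correlation between sentiment and sales can be measured in several ways. The sketch below uses the Pearson coefficient as one plausible choice, since the source does not say which measure the report relied on. The monthly sentiment scores and unit sales are made-up figures for a single hypothetical brand.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length, at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical monthly figures for one brand.
sentiment = [0.10, 0.25, 0.30, 0.45, 0.50, 0.65]   # average sentiment score
sales = [110.0, 135.0, 150.0, 170.0, 180.0, 210.0]  # units sold, in thousands

r = pearson(sentiment, sales)
print(f"sentiment vs. sales correlation: {r:.3f}")
```

A value near 1 indicates that sentiment and sales moved together over the period. It does not by itself show that one drove the other.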

Python, JavaScript, D3.js, Twitter REST API, proprietary sentiment engine.

Historical Patents Parsing

Google has released over 1000 terabytes of patent data, but only 5% of it is available in structured form. There was a need to create a gold mine of parsed, clean, and linked patent data so that innovations missed over time could be identified.

Developed an architecture to process historical patents from 1900 to the present and extract structured patent information. The steps included fetching patent data from Google storage, format classification, image processing, OCR, and text parsing.

A framework that extracts structured patent information from historical patents from 1900 to the present.

Python, Celery, Google Compute Engine, NLTK, Hadoop.

Geo-Location Based Brand Perception and Engagement


Entity Wise Sentiment Analysis

Researchers wanted to study the role of individuals and entities in political upheavals across the world. The purpose of the research was to find patterns of conflict and cooperation among the entities over time, and the reasons for changes in those patterns.

Developed a novel engine that infers a context-free grammar from target information feeds to extract entities and the relevant subjective information about them, then statistically calculates sentiment among the extracted entities. These analytical results were used to extract the network of key entities that appear in news articles and to identify whether there was conflict or cooperation among them, along with the major factors that led to changes in their conflict/cooperation relationship over time.

A novel engine that identifies the network of key entities that appear in news articles, ascertains whether there was conflict or cooperation among them, and highlights the major factors that led to changes in their conflict/cooperation relationship over time.

Python, Java, Stanford CoreNLP, Google Compute Engine, Apache Tomcat Server, Flask, Google App Engine.

Color-Related Social Media Analytics

A US retailer was interested in finding out the influence of a product’s color on its popularity and sales among the customers.

Tweets for 1000+ products and 600+ brands were extracted over a period of one year, totaling about 100 million tweets. These tweets were analyzed for perceived adjectives, sentiments, and color associations with the product.

A detailed report providing insights on perceived adjectives, sentiments and color association towards the product.

Python, Java, Stanford CoreNLP, NLTK, Hadoop, Google Compute Engine.

Building Decision-Support Models to Assess Risk in a Banking System

How can the risk of a country's whole banking system be assessed on a daily basis?

After structured data extracted from six different unstructured sources was piped in, the resulting system network was analyzed using Granger causality and network analysis methods to produce a risk index.

A user-friendly, customized dashboard illustrating the real-time network of financial institutions, along with the risk index, was deployed at the central bank for daily assessment of risk in the national financial system.

Python, Java, Stanford CoreNLP, Google Compute Engine, Apache Tomcat Server, Flask, Google App Engine.

Last-Mile Route Optimization Planner

Required a statistical model framework for transportation planning to reduce travel time and optimize cost.


Restaurant Feedback Textual Analysis
