The Genesis Of teX.ai
Many businesses struggle to analyze their text data. These businesses cut across industries such as Fintech, E-commerce, Market Research, Electronic Media and Legal.
Indium has extensive experience in leveraging NLP expertise to solve business critical problems across functions. After months of R&D and prototyping, we have launched our proprietary text analytics accelerator platform teX.ai.
teX.ai – An AI based Text Analytics suite of solutions that offers:
teX.ai is a SaaS-based accelerator for which Indium has working prototypes covering a variety of text analytics use cases, some of which are mentioned below.
Legal organizations can achieve the following using teX.ai.
- Highlight the key words and phrases in each section of a document using Named Entity Recognition
- Identify the latent topics inside your documents without reading them using Topic Modeling
- Make a succinct list of mandatory and optional clauses (do’s and don’ts) using Clustering & ETL
For an electronic media news website, teX.ai can do the following:
- Automatically categorize articles across sports, entertainment, politics and pop-culture using Classification
- Filter out spam/abusive comments on news items using Classification
- Automatically summarize long (TL;DR) articles using Summarization
Classify text into different entities and find the sentiment
[ Customer type as Irate, Satisfied, Indifferent ] [ Positive, Negative, Neutral ]
Extract latent topics from the article
[ For banking complaints: Customer Care, Account Opening, Balance Related, Credit Card ]
Validate the text, numbers and tables in a document against their sources
[ Match GDP numbers from government sources, economy ratings from ratings agencies ]
Match similar words for spelling correction & data cleaning
[ Match “Amazn” to “Amazon”, “newyork” to “New York” ]
Scraping to Structured data
Scrape data/tables from varied sources: web, PDFs, images
[ Data tables with variables ]
Create concise summaries in sentences or phrases
[ Summarize the article into 50+ characters ]
Exploratory Text Analysis
Search for words, frequencies, patterns, TF-IDF
[ List of important words and frequencies, Format – Table, Title, Text]
Generation - Chatbot
Generate automatic text and form Q&A Chatbot
[ Rasa NLU based chatbot, RNN]
Find the similarity between documents and group similar documents
[ List of documents with similarity index and the topic name ]
Named Entity Recognition
Identify entities and assign them to named-entity classes
[ Location, Person, Organizations, Money, Date, Time ]
Text Data ETL
Storing text data in databases for efficient searching
[ Storage in Elasticsearch, Algolia, MongoDB ]
Businesses considering text analytics services can see significant value in teX.ai, such as:
- Shorter idea to solution deployment timeline for business problems
- Lower R&D and development costs
- Access to text analytics solution frameworks and methodologies
- Access to a ready talent pool with proven expertise for a tough-to-hire skillset
Indium is excited about adding great value to our prospective clients’ businesses and scaling our offerings using teX.ai. If you have a problem that falls in this domain, feel free to reach out to us to learn more.
Natural Language Processing
Identification and Categorization
Optical Character Recognition (OCR)
Topic Modelling & Text Classification
Web Scraping packages
Financial Research Firms
A Few Examples of our Success
One client, an e-commerce aggregator, was having a hard time optimizing search on their website. They had an overwhelming 100mn+ product listings across retailers, with different retailers using different catalogue categories for the same product. We used Naïve Bayes based text classification to map the 100mn+ products to 5400 Google shopping categories. This improved their search results, leading to 5x better conversions.
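To give a feel for how Naïve Bayes text classification works on product titles, here is a minimal sketch in plain Python. It is an illustration only, not the production solution: the product titles, the three category labels, and the helper names (`tokenize`, `NaiveBayesClassifier`) are all made up for this example, and a real catalogue would need far more training data and categories.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class NaiveBayesClassifier:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.class_counts = Counter()            # training titles per category
        self.word_counts = defaultdict(Counter)  # token counts per category
        self.vocabulary = set()                  # all tokens seen in training

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.class_counts[label] += 1
            self.word_counts[label].update(tokens)
            self.vocabulary.update(tokens)
        return self

    def predict(self, text):
        tokens = tokenize(text)
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            # Start from the log prior of the category.
            score = math.log(doc_count / total_docs)
            total_words = sum(self.word_counts[label].values())
            # Add the smoothed log likelihood of each token.
            for token in tokens:
                count = self.word_counts[label][token]
                score += math.log((count + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical training data: product titles and illustrative category labels.
train_titles = [
    "Apple iPhone 13 128GB smartphone",
    "Samsung Galaxy S21 5G smartphone",
    "Nike running shoes for men",
    "Adidas leather sneakers women",
    "Stainless steel cookware set",
    "Non-stick frying pan 28cm",
]
train_categories = [
    "Electronics > Mobile Phones",
    "Electronics > Mobile Phones",
    "Apparel > Shoes",
    "Apparel > Shoes",
    "Home > Kitchen > Cookware",
    "Home > Kitchen > Cookware",
]

classifier = NaiveBayesClassifier().fit(train_titles, train_categories)
predicted = classifier.predict("Google Pixel 5G smartphone 128GB")
print(predicted)
```

The classifier picks the category whose training titles share the most vocabulary with the new listing, which is why differently worded titles for the same kind of product can still land in one category.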
A marketing research client of ours was having trouble verifying the metrics, indices and charts in their reports. We developed an NLP-based text scraping and validation solution that compares data from sources such as the World Bank website against, for example, a PDF report on country-wise per capita income. This drastically reduced the man-hours they spent on manual QA, saving them significant cost.
A Fintech (employee loans) client of ours was about to reach a million active users. During onboarding, they asked users for their place of work. Because this was a free-text field, it was prone to errors: some users would misspell Amazon as ‘Amzon’ and others as ‘Amazin’. As a result, the client was missing out on critical insights about their users’ places of work. We brought Levenshtein Distance based text matching to their rescue, tracing all such misspellings back to ‘Amazon’. They can now customize their marketing campaigns by organization for better ROI.
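As a rough illustration of Levenshtein Distance based matching, the sketch below computes the edit distance between a free-text entry and each name in a list of known employers, and accepts the closest one if it is within a small number of edits. The employer list, the `match_employer` helper and the `max_distance=2` threshold are assumptions chosen for this example, not details of the client solution.

```python
def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn string a into string b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]


def match_employer(raw_name, known_employers, max_distance=2):
    """Return the closest known employer within max_distance edits, else None."""
    cleaned = raw_name.strip().lower()
    best_name, best_distance = None, max_distance + 1
    for name in known_employers:
        distance = levenshtein(cleaned, name.lower())
        if distance < best_distance:
            best_name, best_distance = name, distance
    return best_name


# Hypothetical reference list of employers.
known_employers = ["Amazon", "Google", "Microsoft"]

for raw in ["Amzon", "Amazin", "Gogle", "Acme Corp"]:
    print(raw, "->", match_employer(raw, known_employers))
# Amzon -> Amazon
# Amazin -> Amazon
# Gogle -> Google
# Acme Corp -> None
```

The distance threshold is the main design choice here: too low and genuine misspellings are missed, too high and unrelated company names start getting merged, so it is worth tuning against a sample of real entries.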