The Genesis of teX.ai
Many businesses struggle to analyze their text data. They span industries such as fintech, e-commerce, market research, electronic media, and legal.
Indium has extensive experience in leveraging NLP expertise to solve business-critical problems across functions. After months of R&D and prototyping, we have launched teX.ai, our proprietary text analytics accelerator platform.
teX.ai is an AI-based suite of text analytics solutions.
It is a SaaS-based accelerator for which Indium has working prototypes covering a variety of text analytics use cases, some of which are described below.
Legal organizations can achieve the following using teX.ai.
- Highlight the key words and phrases for each section of a document using Named Entity Recognition
- Identify the latent topics inside your documents without reading them using Topic Modeling
- Make a succinct list of mandatory and optional clauses (do’s and don’ts) using Clustering & ETL
For an electronic media news website, teX.ai can do the following:
- Automatically categorize articles across sports, entertainment, politics, and pop culture using Classification
- Filter out spam and abusive comments on news items using Classification
- Automatically summarize long (TL;DR) articles using Summarization
Classify text into different entities and find the sentiment
[ Customer type as Irate, Satisfied, Indifferent ] [ Positive, Negative, Neutral ]
Extract latent topics from the article
[ For banking complaints – Customer Care, Account Opening, Balance Related, Credit Card ]
Validate the text, numbers, and tables in a document against their sources
[ Match GDP numbers from government sources, economy ratings from ratings agencies ]
Match similar words for spelling correction & data cleaning
[ Match “Amazn” to “Amazon”, “newyork” to “New York” ]
Scraping to Structured data
Scrape data and tables from varied sources – web, PDFs, images
[ Data tables with variables ]
Create concise summaries as sentences or phrases
[ Summarize the article into 50+ characters ]
Exploratory Text Analysis
Search for words, frequencies, patterns, TF-IDF
[ List of important words and frequencies, Format – Table, Title, Text]
Generation - Chatbot
Generate automatic text and form Q&A Chatbot
[ Rasa NLU based chatbot, RNN]
Find the similarity between documents and group similar documents
[ List of documents with similarity index and the topic name ]
Named Entity Recognition
Identify the entities under the different name classifiers
[ Location, Person, Organizations, Money, Date, Time ]
Text Data ETL
Storing text data in databases for efficient searching
[ Storage in Elasticsearch, Algolia, MongoDB ]
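To make the TF-IDF scoring mentioned under Exploratory Text Analysis more concrete, here is a minimal sketch in plain Python. It is not teX.ai's implementation; it uses the basic unsmoothed formula, whitespace tokenization, and made-up banking complaint snippets, all of which are assumptions for illustration.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {word: score} dict per document.

    tf  = occurrences of the word in the document / words in the document
    idf = log(number of documents / number of documents containing the word)
    """
    tokenized = [doc.lower().split() for doc in documents]

    # Count in how many documents each word appears at least once.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        scores.append({
            word: (count / len(tokens)) * math.log(len(documents) / doc_freq[word])
            for word, count in counts.items()
        })
    return scores


# Hypothetical complaint snippets, for illustration only.
docs = [
    "credit card payment failed",
    "credit card limit increase",
    "account opening delayed",
]
scores = tf_idf(docs)
for doc, doc_scores in zip(docs, scores):
    ranked = sorted(doc_scores, key=doc_scores.get, reverse=True)
    print(f"{doc!r}: words by importance = {ranked}")
```

Words that appear in many documents (here "credit" and "card") score lower than words specific to one document, which is what makes the resulting table of words and weights useful for exploration.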
Businesses considering text analytics services can see significant value in teX.ai, such as:
- Shorter idea-to-solution deployment timelines for business problems
- Lower R&D and development costs
- Access to text analytics solution frameworks and methodologies
- Access to a ready talent pool with proven expertise for a tough-to-hire skillset
Indium is excited about adding great value to our prospective clients’ businesses and scaling our offerings using teX.ai. If you have a problem that falls in this domain, feel free to reach out to us to learn more.
Natural Language Processing
Identification and Categorization
Optical Character Recognition (OCR)
Topic Modelling & Text Classification
Web Scraping packages
Financial Research Firms
A Few Examples of our Success
An e-commerce aggregator client was having a hard time optimizing search on their website. They had an overwhelming 100 million+ product listings across retailers, with the same product filed under different catalogue categories by different retailers. We used Naïve Bayes-based text classification to map the 100 million+ products to 5,400 Google shopping categories. This improved their search results, leading to 5x better conversions.
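For readers unfamiliar with the technique, the following is a minimal sketch of Naïve Bayes text classification in plain Python. It is not the production solution described above: the listings, the two category names, and the `NaiveBayesClassifier` helper are hypothetical, and a real system would use a proper tokenizer and far more training data.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace; a real pipeline would do more."""
    return text.lower().split()


class NaiveBayesClassifier:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, titles, labels):
        self.class_counts = Counter(labels)      # documents per category
        self.word_counts = defaultdict(Counter)  # word counts per category
        self.vocab = set()
        for title, label in zip(titles, labels):
            tokens = tokenize(title)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, title):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # Start from the log prior, then add per-word log likelihoods.
            score = math.log(n_docs / total_docs)
            n_words = sum(self.word_counts[label].values())
            for token in tokenize(title):
                if token not in self.vocab:
                    continue  # ignore words never seen in training
                count = self.word_counts[label][token]
                score += math.log((count + 1) / (n_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical listings and category names, for illustration only.
train_titles = [
    "wireless bluetooth headphones",
    "noise cancelling earbuds",
    "mens running shoes",
    "leather running sneakers",
]
train_labels = ["Electronics", "Electronics", "Shoes", "Shoes"]

model = NaiveBayesClassifier().fit(train_titles, train_labels)
print(model.predict("bluetooth earbuds"))
print(model.predict("trail running shoes"))
```

Working in log space avoids numeric underflow when titles are long, and the add-one smoothing keeps a single unseen word-category pair from zeroing out a category.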
A marketing research client of ours was facing problems assessing the veracity of the metrics, indices, and charts in their reports. We developed an NLP-based text scraping and validation solution that checks figures across data sources, for example comparing the World Bank website with a PDF report on country-wise per capita income. This drastically reduced the man-hours they spent on manual QA, saving them significant cost.
A fintech (employee loans) client of ours was about to reach a million active users. During onboarding, they asked users for their place of work. Because this was a free-text field, it was prone to errors: some users would misspell Amazon as ‘Amzon’ and others as ‘Amazin’. As a result, they were missing critical insights about their users’ places of work. We brought Levenshtein distance-based text matching to their rescue, and all such misspellings were traced back to ‘Amazon’. They could then customize their marketing campaigns by organization for better ROI.
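The idea behind Levenshtein distance-based matching can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the client solution: the list of known employers, the `closest_employer` helper, and the `max_distance=2` threshold are all hypothetical choices.

```python
def levenshtein(a, b):
    """Edit distance: minimum number of single-character insertions,
    deletions, and substitutions needed to turn a into b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]


def closest_employer(raw, known_employers, max_distance=2):
    """Return the nearest known name, or None if nothing is close enough."""
    cleaned = raw.strip().lower()
    best = min(known_employers, key=lambda name: levenshtein(cleaned, name.lower()))
    if levenshtein(cleaned, best.lower()) <= max_distance:
        return best
    return None


# Hypothetical canonical list of employers, for illustration only.
known = ["Amazon", "Google", "Microsoft"]
for entry in ["Amzon", "Amazin", "Gogle", "Xyz Corp"]:
    print(entry, "->", closest_employer(entry, known))
```

The distance threshold matters: too low and genuine misspellings go unmatched, too high and unrelated short names start collapsing into each other, so it would normally be tuned on real data.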
teX.ai is a suite covering multiple use cases. Clients usually come with inputs of different formats, sizes, and shapes, and the outputs they need also take many forms, such as flat files, databases, and feeds to other systems. Hence, we customise the input and output endpoints as per your requirements.
teX.ai can be deployed in your systems immediately and will work well if your inputs match the input formats it already supports. It is a readymade product for many clients; however, some amount of customisation will help solve the specific problems you have in mind for your domain.
We have created teX.ai to cater to multiple domains. Conceptually the modules work well across various domains. For instance, the scraping module can be applied to any domain; the data validation case can be used for a product or service company.
The pricing model works in two ways.
- Product cost + customisation
- Subscription model per file
In both models, there is a customisation phase with a couple of data scientists to understand your requirements and tune the product to them.
In the first model, once the product is finalised, the product is yours to play with.
In the second model, you can use the application from the cloud to process your files.
Updates will be required only:
- When your requirements change.
- If the development environment used, such as Python or R, is updated. When such updates require code enhancements, we will replace the product with a new version, which may incur a nominal charge.
The product is a suite of solutions. The use cases required by the client can be customised based on the output required. The timeline for each solution can range from 3 weeks to 3 months, depending on the complexity of the requirement.
- Other free online tools have rigid process flows, formats, and solutions, and you need to upload documents in a specific format. teX.ai is customised to your requirements.
- We have designed teX.ai to solve multiple use cases with one solution. We understand that text analytics problems rarely come singly; clients usually want to explore solutions for several problems at once.
Currently, we support extraction of text from images. File formats such as xls, csv, pdf, doc, qif, and jpeg are already supported. At this point, teX.ai cannot extract data from videos.