Mosaic helps organizations extract hidden insights and automate language tasks using AI.

Natural Language Processing & Advanced Language Modeling Are Closer Than You Think

Mosaic helps our customers find the needle in the unstructured haystack, pulling insights from text, voice, audio, image, and speech to inform operational & strategic decision making across any business unit.

Our unique blend of data engineering, machine/deep learning, business acumen, and custom deployment experience lends itself to a consultative approach to building solutions that drive powerful results for our clients.

New to NLP & Text Sensing?

It is commonly estimated that almost 80% of the data a company collects is unstructured. Just think about the sheer number of emails, resumes, text documents, research findings, legal contracts, invoices, call recordings, and social media posts your firm possesses.

If businesses could take full advantage of the value this information holds when they need it, they would be able to solve all sorts of challenges across the organization.

Mosaic can help you implement your first Natural Language Processing application or improve an existing one. Our NLP consulting and deployment services are tailored to you & your business processes, not the other way around. We bring the deep machine learning expertise necessary to quickly take advantage of these powerful algorithms.

Making Sense of Natural Language Processing

There is much to unpack in the world of Natural Language Processing. We sat down with Senior Data Scientist Alex Tennant for his perspective on the opportunities and complexities of NLP, and how Mosaic is paving the way for the consistent evolution of this powerful technology.

More Than Just Chatbots

If you were to ask five different people what Natural Language Processing (NLP) is, you would likely get five very different answers. The rise of personal assistants and chatbots has helped spread this technology into our everyday lives, but most businesses are barely scratching the surface of what is possible with these algorithms.

Mosaic groups NLP into three high-level categories that are relevant to machine learning & text analytics applications:

Language Processing & OCR

Think of this as any text data a business is collecting: invoices, purchase orders, service agreements, social media posts, research documents, etc. NLP algorithms and Optical Character Recognition (OCR) technology allow a person to convert scanned documents into text-searchable files, increasing the efficiency and effectiveness of combing through text data.

Natural Language Understanding

Using NLP techniques, it is easy to extract metadata from text such as entities, keywords, categories, emotion, relations, and syntax. Deep learning can encode the meanings of individual words and sentences in context or even of entire documents and use this information to categorize documents, extract relevant facts, or infer characteristics of the authors. Structured outputs of NLP models can even be used as inputs to downstream predictive machine learning models.
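
As a rough illustration of how structured outputs can feed a downstream model, here is a minimal sketch that pulls keywords from a document and encodes it as a fixed-length feature vector. It uses simple word counts rather than deep learning, and the stopword list, function names, and sample text are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical stopword list, kept tiny for illustration.
STOPWORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "for", "is", "on", "was", "with"}


def extract_keywords(text, top_n=3):
    """Return the top_n most frequent non-stopword tokens in text."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(t for t in tokens if t not in STOPWORDS)
    return [word for word, _ in counts.most_common(top_n)]


def to_features(text, vocabulary):
    """Encode text as a count vector over a fixed vocabulary (assumed known in advance)."""
    counts = Counter(re.findall(r"[a-z]+", text.lower()))
    return [counts[term] for term in vocabulary]


doc = "The invoice for the pump was late. The pump invoice is overdue."
keywords = extract_keywords(doc, top_n=2)
features = to_features(doc, ["invoice", "pump", "refund"])
print(keywords, features)
```

The feature vector is the kind of structured output that could be passed to a predictive model. A production system would likely replace the counting step with a trained entity or keyword extractor.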

Natural Language Generation

Speech and text processing both analyze the structure of the data, but we humans do not produce language for the sake of analysis. We produce language to communicate, and deep learning models need to generate it for the same purpose. NLG automatically generates narratives that describe, summarize or explain input data in a human-like manner at the speed of thousands of pages per second.

Mosaic helps organizations design & deploy custom NLP applications that solve problems. The projects below are a sample of that work.

Drone Aircraft Control System
Speech Recognition

Mosaic developed an autonomous planning system that combines speech recognition with air traffic control domain knowledge for supervisory control of an unmanned aircraft in a high-fidelity simulation.
Watch the Video →

Social Media Processing for Public Health
Information Extraction & Sentiment Analysis

Mosaic built a web-based tool for the CDC to understand how people manage long-term health conditions. People frequently turn to their social media platforms to discuss symptoms from various ailments.
Learn More →

Packaging Information Scan
Information Extraction

Mosaic has built a capability to automate product packaging review, significantly reducing human review and improving the accuracy of detecting mislabeled packages.
Read the White Paper →

Quantifying Customer Interactions
Sentiment Analysis

Mosaic helped a leading retailer understand the sentiment of customer service interactions through their call center, social media properties, surveys, and other unstructured text sources. Once sentiment was identified, they needed to quantify how much impact a negative interaction had on their customers. Mosaic tied in Lifetime Value metrics to understand just how much a negative experience cost.
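
The idea of tying sentiment to Lifetime Value can be sketched as follows. This is not the model used in the project: the word lists, the `churn_rate_if_negative` parameter, and the sample interactions are hypothetical, and a real system would use a trained sentiment classifier rather than a lexicon.

```python
import re

# Hypothetical sentiment lexicon, for illustration only.
POSITIVE = {"great", "helpful", "fast", "thanks", "resolved"}
NEGATIVE = {"late", "broken", "rude", "refund", "cancel"}


def sentiment_score(text):
    """Return a score in [-1, 1]: (positive hits - negative hits) / total hits."""
    tokens = re.findall(r"[a-z]+", text.lower())
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total


def value_at_risk(interactions, churn_rate_if_negative=0.1):
    """Sum the lifetime value behind negative interactions, scaled by an assumed churn rate."""
    at_risk = 0.0
    for text, lifetime_value in interactions:
        if sentiment_score(text) < 0:
            at_risk += lifetime_value * churn_rate_if_negative
    return at_risk


# Each interaction pairs a transcript snippet with that customer's lifetime value.
interactions = [
    ("The agent was rude and my order arrived broken", 1200.0),
    ("Great service, issue resolved fast", 800.0),
    ("Where is my package", 500.0),
]
total_at_risk = value_at_risk(interactions)
print(total_at_risk)
```

Only the first interaction scores as negative, so only that customer's lifetime value contributes to the total.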

Purchase Order Anomaly Detection
Document Classification

Mosaic designed & deployed a custom ML application powered by NLP to identify anomalous purchase orders for one of the country’s largest hospital systems, cutting down on human review & more accurately flagging the POs that need attention.
Read the Case Study →

Contextual Search for Research & Geological Info
Text Processing

Mosaic built text processing capabilities into a reverse image search application for a large multinational industrial firm. The application pulls all information relevant to geographical locations around the globe, allowing business users to access this information in seconds rather than weeks.
Read the Case Study →

Additional NLP Resources

NLP & language-based models have advanced significantly over the past two years. Read our latest musings on Transformer models.
Transformer and Computer-generated Document Summaries

Natural language models have come a long way in the past couple of years. With the advent of the deep learning Transformer architecture, it became possible to generate text that could, plausibly, be passed off as written by a human.

Skynet is close... GPT-3 is certainly pushing the limits of what is possible with text processing. Read our review of GPT-3 and the future of NLP.
Skynet progress update: GPT-3

For those not in the know, the new GPT-3 is a massive language model trained on a large portion of the public internet. The sheer size of GPT-3 alone is astounding, weighing in at roughly 100x more parameters, and ingesting roughly 100x more training data, than the previous generation of language models.

Mosaic lays out an automation approach to summarizing lengthy and complex papers with AI in our white paper.
Mosaic compiled a sheet examining the rise of NLP & intelligent text processing.

Natural Language Processing in 3 Steps

Extracting Text from Pictures and Audio

Data Extraction | Data Transformation | Linguistic Extraction | Text Extraction

Machine Learning Algorithm Prototyping & Validation using Converted Text Data

Summarization | Classification | Information Extraction | Contextual Search

Scalable Architecture to Support Real Time Search Queries

Cloud | Parallel Processing | Network Configuration
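
To make the contextual search step more concrete, here is a minimal in-memory sketch that ranks already-extracted text by TF-IDF cosine similarity to a query. It illustrates the general technique only: the function names and sample documents are hypothetical, and it does not address the scalable, real-time architecture described in the third step.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def build_index(documents):
    """Compute a TF-IDF vector (dict of term -> weight) for each document."""
    tokenized = [tokenize(d) for d in documents]
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    n_docs = len(documents)
    # Smoothed inverse document frequency, so no term gets a zero weight.
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    vectors = [{t: c * idf[t] for t, c in Counter(tokens).items()} for tokens in tokenized]
    return idf, vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return 0.0 if norm_a == 0 or norm_b == 0 else dot / (norm_a * norm_b)


def search(query, documents, idf, vectors):
    """Return (score, document) pairs sorted from most to least similar."""
    q_counts = Counter(t for t in tokenize(query) if t in idf)
    q_vec = {t: c * idf[t] for t, c in q_counts.items()}
    scored = [(cosine(q_vec, v), d) for v, d in zip(vectors, documents)]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


documents = [
    "Purchase order for drilling equipment in the northern basin",
    "Geological survey of sandstone formations in the northern basin",
    "Customer service call transcript about a delayed invoice",
]
idf, vectors = build_index(documents)
results = search("geological survey", documents, idf, vectors)
print(results[0][1])
```

Keeping the vectors as sparse dictionaries keeps the sketch dependency-free. At production scale the same ranking idea would typically sit behind a dedicated search index.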

By bringing dynamic software development & expertise into deep learning development, Mosaic Data Science is poised to build a custom AI language processing system that will save your firm millions.