Summary
Our white paper explores the processes and opportunities presented by NLP for social media in extracting valuable data to drive improved operational and strategic decisions for R&D efforts.

Natural language processing (NLP) is one of the most promising social media data processing avenues. It is a scientific challenge to develop powerful methods and algorithms that extract relevant information from a large volume of data from multiple sources and languages in various formats or free form. The combination of NLP and machine learning can enable organizations to gain in-depth insights from unstructured data in ways they never have before.
The potential of NLP for social media can be seen in Mosaic’s involvement in a recent R&D effort with a prominent medical device company. The company gathered competitive information and market intelligence by combing publicly available data. In this data, social media posts were a key source of information.
The medical device company aimed to develop a capability to accelerate the research and development process by automatically gathering relevant social media posts, identifying and resolving entities of interest (companies, products, key people, and events) in posts, labeling posts by topic or research area, and aggregating information across posts to help surface broader trends and insights.
With ample experience developing and deploying advanced machine learning models, decision support capabilities, and custom NLP solutions for several customers, Mosaic Data Science is the ideal partner to build and automate a market intelligence toolset. Mosaic’s deep experience applying machine learning algorithms to critical business decisions enables the team to support and facilitate efforts to analyze unstructured social data using NLP for social media and deep learning, informing competitive intelligence.
In the next few sections, we explore how Mosaic would use NLP for social media to deliver robust market & competitive intelligence.
Creating an NLP Prototype for Social Media
The scale of such a project (the number of social media posts or accounts analyzed) depends on how many posts the automated NLP pipeline can ingest. In this context, building a preliminary prototype toolset for aggregating and processing relevant social media content provides immediate value by automating post collection and initial parsing, while also laying the foundation for a more robust and complete capability.
An NLP for social media prototype can analyze social media by taking the following steps:
- Collecting relevant social media posts
- Identifying and resolving entities of interest within the posts
- Categorizing the posts by topic or research area
- Assessing the sentiment of each post toward the entities of interest
Preparing an NLP Model for Social Media Analysis
To kick off a successful NLP project, reviewing relevant documentation of current social media research workflows and past research analyses is important.
Next, Mosaic’s team will develop the pipeline to collect, clean, and store data from the social media API and other identified sources. The team must set up the necessary infrastructure to support data collection.
We must also identify entities of interest, such as companies, products, and research areas. Code can be implemented to map the different ways of referencing a single entity (e.g., full company name, short name, or stock ticker) to one canonical entry. Search queries can then be generated from hashtags, users, keywords, and other information to obtain the social media content relevant to R&D analyses of these entities.
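The alias mapping and query generation described above can be sketched roughly as follows. This is a minimal illustration, not Mosaic's implementation: the company names, tickers, and function names are all hypothetical, and the OR-joined query string is a generic format that would need adapting to whichever social media API is actually used.

```python
import re

# Hypothetical alias table: every name and ticker here is made up for illustration.
ENTITY_ALIASES = {
    "Acme Medical Devices": ["Acme Medical Devices", "Acme Medical", "$ACMD"],
    "Borealis Health": ["Borealis Health", "Borealis", "$BRLH"],
}


def build_alias_patterns(aliases):
    """Compile one case-insensitive regex per canonical entity."""
    patterns = {}
    for canonical, names in aliases.items():
        # Try longer aliases first so the full name wins over its short form.
        ordered = sorted(names, key=len, reverse=True)
        alternation = "|".join(re.escape(name) for name in ordered)
        # Lookarounds instead of \b so aliases that start with "$" still match.
        patterns[canonical] = re.compile(
            rf"(?<!\w)(?:{alternation})(?!\w)", re.IGNORECASE
        )
    return patterns


def resolve_entities(text, patterns):
    """Return the sorted canonical names of all entities mentioned in a post."""
    return sorted(name for name, pattern in patterns.items() if pattern.search(text))


def build_search_query(canonical, aliases, hashtags=()):
    """Join an entity's aliases and optional hashtags into a generic OR query."""
    terms = [f'"{name}"' for name in aliases[canonical]]
    terms += [f"#{tag}" for tag in hashtags]
    return " OR ".join(terms)


patterns = build_alias_patterns(ENTITY_ALIASES)
mentioned = resolve_entities("Big news from $ACMD and borealis today", patterns)
query = build_search_query("Borealis Health", ENTITY_ALIASES, hashtags=["medtech"])
```

Keeping the alias table as plain data means analysts can add a new short name or ticker without touching the matching logic.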
Categorizing Social Media Content in an NLP Environment
To easily identify social media content, labeling is key. One method categorizes the extracted posts according to predetermined key focus areas. For example, some posts may relate to a product review or to the entry of a new competitor. Mosaic worked with the medical device company to understand the focus areas. Posts can be categorized using different signals, such as their source, the hashtags they contain, or specific words they mention, or with a simple machine learning classifier if labeled data is readily available.
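A rule-based version of this categorization, using only hashtags and keywords, might look like the sketch below. The focus areas and trigger terms are hypothetical placeholders; in practice they would come from the customer's own focus areas, and a trained classifier could replace the rules once labeled data exists.

```python
# Hypothetical focus areas and trigger terms, for illustration only.
FOCUS_AREA_RULES = {
    "product_review": {
        "hashtags": {"review"},
        "keywords": {"review", "tested", "rating"},
    },
    "new_competitor": {
        "hashtags": {"launch"},
        "keywords": {"launches", "enters", "startup"},
    },
}


def tokenize(text):
    """Lowercase a post and split it into plain words and hashtags."""
    words, hashtags = set(), set()
    for raw in text.lower().split():
        token = raw.strip(".,!?;:\"'()")
        if token.startswith("#") and len(token) > 1:
            hashtags.add(token[1:])
        elif token:
            words.add(token)
    return words, hashtags


def categorize(text, rules=FOCUS_AREA_RULES):
    """Return every focus area whose keywords or hashtags appear in the post."""
    words, hashtags = tokenize(text)
    labels = [
        area
        for area, rule in rules.items()
        if words & rule["keywords"] or hashtags & rule["hashtags"]
    ]
    # Posts that match no rule are kept, but marked for manual review.
    return sorted(labels) or ["uncategorized"]
```

A post can receive more than one label here, which suits posts that both review a product and announce a launch.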
In addition, it is important to implement a sentiment model to assign positive, negative, or neutral sentiments to extracted posts. Mosaic can work with our customers to evaluate the need for a customized sentiment model, which would require data labeling and model training, versus being able to use a standard pre-trained model.
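To make the positive, negative, and neutral labels concrete, here is a deliberately simple lexicon-based scorer. It is neither the customized model nor a standard pre-trained model mentioned above, only a baseline stand-in showing the expected input and output; the word lists are hypothetical and far too small for real use.

```python
# Hypothetical, tiny word lists; a real lexicon or trained model would replace them.
POSITIVE = {"great", "reliable", "love", "improved", "excellent"}
NEGATIVE = {"recall", "faulty", "poor", "failed", "disappointing"}
NEGATORS = {"not", "never", "no"}


def sentiment(text):
    """Label a post positive, negative, or neutral from word-level polarity."""
    tokens = [t.strip(".,!?;:\"'()") for t in text.lower().split()]
    score = 0
    for i, token in enumerate(tokens):
        # +1 for a positive word, -1 for a negative word, 0 otherwise.
        polarity = (token in POSITIVE) - (token in NEGATIVE)
        # Flip the polarity when the previous word is a negator ("not great").
        if polarity and i > 0 and tokens[i - 1] in NEGATORS:
            polarity = -polarity
        score += polarity
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

A baseline like this is mainly useful for judging how much a customized or pre-trained model improves on simple word counting.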
Sentiment analysis through NLP for social media is a proven way to gauge how people feel about a company, product, or event. Instead of asking an analyst to spend weeks reading social media comments and writing a report, sentiment analysis can produce a quick summary, saving resources and speeding up decisions.
For higher-level insights, Mosaic can design and develop analytics that aggregate information across processed posts. For example, this could include tracking the level of activity (posts, shares, comments, etc.) related to a company, or the sentiment toward a new product launch over time. Mosaic can also work with the customer to determine how personnel will query, consume, and interact with the information and insights generated.
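The kind of aggregation described above, activity and sentiment per entity over time, can be sketched with the standard library alone. The post fields (`entity`, `date`, `sentiment`, `shares`, `comments`) and the sample records are assumptions made for this example, not a schema from the project.

```python
from collections import defaultdict
from datetime import date

# Hypothetical processed posts; field names and values are illustrative.
SAMPLE_POSTS = [
    {"entity": "Acme Medical", "date": date(2023, 3, 2),
     "sentiment": "positive", "shares": 4, "comments": 1},
    {"entity": "Acme Medical", "date": date(2023, 3, 20),
     "sentiment": "negative", "shares": 2, "comments": 3},
    {"entity": "Acme Medical", "date": date(2023, 4, 5),
     "sentiment": "positive", "shares": 1},
]

SENTIMENT_SCORE = {"positive": 1, "neutral": 0, "negative": -1}


def aggregate_by_month(posts):
    """Summarize activity and mean sentiment per (entity, 'YYYY-MM') bucket."""
    buckets = defaultdict(
        lambda: {"posts": 0, "shares": 0, "comments": 0, "sentiment_sum": 0}
    )
    for post in posts:
        key = (post["entity"], post["date"].strftime("%Y-%m"))
        bucket = buckets[key]
        bucket["posts"] += 1
        # Missing engagement counts are treated as zero.
        bucket["shares"] += post.get("shares", 0)
        bucket["comments"] += post.get("comments", 0)
        bucket["sentiment_sum"] += SENTIMENT_SCORE[post["sentiment"]]
    return {
        key: {
            "posts": b["posts"],
            "shares": b["shares"],
            "comments": b["comments"],
            "mean_sentiment": b["sentiment_sum"] / b["posts"],
        }
        for key, b in buckets.items()
    }


summary = aggregate_by_month(SAMPLE_POSTS)
```

Monthly buckets are an arbitrary choice here; weekly or daily buckets would only change the key format.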
Additional NLP for Social Media Opportunities
In addition to the applications outlined above, developing a bespoke deep learning-powered solution can also prove powerful for the following use cases:
Customer Service: Natural language processing solutions give businesses a reliable way to keep track of a large amount of user-generated material across several digital channels. Companies gain a competitive edge by using NLP techniques to engage customers actively and reply to their queries or complaints. NLP tools also help businesses set up content filters, such as spam filters or filters for undesired information, on websites or social media platforms.
Quality Control: NLP models are effective in maintaining the quality of content on different forums. They help prevent fraudulent backlinks or unsolicited advertisements, which often lead to a poor customer experience. They can also auto-filter user-generated content that could be considered hate speech by detecting “prohibited” words.
Recruitment: Recruiting firms use NLP to extract job descriptions and match them to a candidate's skill set. For corporate hiring, companies can apply the same techniques to the database of applicants and successful hires in their applicant tracking system.
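Detecting “prohibited” words, as mentioned above, reduces in its simplest form to matching tokens against a block list. The sketch below assumes a hypothetical block list with placeholder terms. Exact word matching misses misspellings and context, so a production filter would need more than this.

```python
import re

# Placeholder block list; a real deployment would maintain its own terms.
PROHIBITED_TERMS = {"badword", "spamlink"}


def flag_post(text, prohibited=PROHIBITED_TERMS):
    """Report whether a post is allowed and which prohibited terms it contains."""
    # Lowercase and keep only word-like tokens so punctuation does not hide a match.
    tokens = re.findall(r"[a-z0-9_']+", text.lower())
    hits = sorted(set(tokens) & prohibited)
    return {"allowed": not hits, "matched_terms": hits}
```

Returning the matched terms alongside the decision makes it easier for a moderator to review why a post was filtered.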
How Mosaic Can Help
We have just barely scratched the surface of natural language processing capabilities. NLP is the cornerstone of numerous applications we use daily without even noticing, such as autocorrection, translation, chatbots, and much more. This powerful technology continues to evolve with the latest AI innovations in data science. The global NLP market was estimated at ~$5B in 2018 and is expected to reach ~$43B by 2025, and this rapid growth can mostly be attributed to the wide range of NLP use cases across industries.
When requirements and scope are uncertain, as they often are for the NLP prototyping efforts outlined in this white paper, Mosaic can easily support our customers under the Rent a Data Scientist (RaDS) engagement model. RaDS is a time and materials model that is flexible to project needs: the level of effort can scale up or down, and personnel can be added to the engagement with the customer’s approval if different skill sets are required at various points in the project.
Hiring a data scientist sounds expensive, but Mosaic frequently tackles these problems in an iterative engagement that starts with a proof of concept. Before moving to future iterations of model development, our team will collaborate closely with you to ensure the NLP model insights make sense to your business and align with your goals.
Ready to get started? Contact us today and discover how we can deliver a massive ROI for your organization.