Drew Clancy, Author at Mosaic Data Science

Data Debt

Data debt occurs when data is improperly handled at the technical level with the intention of postponing certain costs, even though the postponed costs will be higher, or the postponed benefits will be lower. The remainder of this document describes some important types of data debt.

By Drew Clancy, 12 years ago January 5, 2024

Blogs

Data Architecture 101, Part 4: Ontology-Driven Development is Lean

In software-development & data architecture nirvana, the business analysts, database technologists, and application developers all speak the same language. Everyone agrees about what each user story means.

By Drew Clancy, 12 years ago January 5, 2024

Blogs

Data Architecture 101, Part 3: Dimensions

Data marts, data warehouses, and some operational datastores use dimension tables. A dimension table categorizes a fact table that joins to the dimension. At query time one filters the facts by values in the dimension table, and uses those values to label the query results.
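As a rough illustration of the fact/dimension relationship described above, here is a minimal sketch using Python's built-in sqlite3 module. The table names (dim_product, fact_sales), the columns, and the sample rows are hypothetical and chosen only for this example; they do not come from the post itself.

```python
import sqlite3

# Hypothetical schema: one dimension table and one fact table that joins to it.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (
    product_id INTEGER PRIMARY KEY,
    category   TEXT NOT NULL
);
CREATE TABLE fact_sales (
    sale_id    INTEGER PRIMARY KEY,
    product_id INTEGER REFERENCES dim_product(product_id),
    amount     REAL NOT NULL
);
INSERT INTO dim_product VALUES (1, 'Books'), (2, 'Games'), (3, 'Music');
INSERT INTO fact_sales VALUES (1, 1, 10.0), (2, 1, 5.0), (3, 2, 20.0), (4, 3, 7.5);
""")


def sales_by_category(conn, categories):
    """Filter facts by dimension values and label the totals with those values."""
    placeholders = ",".join("?" for _ in categories)
    query = f"""
        SELECT d.category, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_product AS d ON d.product_id = f.product_id
        WHERE d.category IN ({placeholders})
        GROUP BY d.category
        ORDER BY d.category
    """
    return conn.execute(query, list(categories)).fetchall()


result = sales_by_category(conn, ["Books", "Games"])
print(result)
```

The dimension column appears twice in the query: once in the WHERE clause to filter the facts, and once in the SELECT list to label each aggregated row.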

By Drew Clancy, 12 years ago January 5, 2024

Blogs

Data Architecture 101, Part 2: Relational Architectures

This post uses those concepts to survey the main types of relational architectures. These divide fundamentally into two types, OLTP and BI, the second having four sub-types.

By Drew Clancy, 12 years ago January 5, 2024

Blogs

The Role of Industry Experience in Data Science

In this post we explain why the assumption about industry experience is outdated, and why industry experience often detracts from the best possible application of data science.

By Drew Clancy, 12 years ago January 5, 2024

Blogs

Data Science Design Pattern #5: Combining Source Variables

Variable selection is perhaps the most challenging activity in the data science lifecycle. Our blog highlights a repeatable approach to variable engineering.

By Drew Clancy, 12 years ago January 5, 2024

White Papers

Predicting Employee Churn

This white paper examines a machine learning approach to predicting employee churn and optimizing for retention.

By Drew Clancy, 12 years ago May 16, 2024

Blogs

Data Science Design Pattern #4: Transformations of Individual Variables

In this post we describe some common ways to transform individual variables, and explore how doing so may benefit an analysis.

By Drew Clancy, 12 years ago January 5, 2024

Blogs

The Executive Role in a Data-Driven Organization

Our blog post examines the role of an executive in a data-driven organization.

By Drew Clancy, 12 years ago January 5, 2024

Blogs

Data Science Design Pattern #3: Handling Null Values

Most data science algorithms do not tolerate nulls (missing values). So, one must do something to eliminate them, before or while analyzing a data set.
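To make "eliminating nulls" concrete, here is a minimal sketch of two common strategies: dropping incomplete rows and filling a column with its mean. These are generic techniques and not necessarily the ones the post recommends; the function names and the sample data are hypothetical.

```python
from statistics import mean


def drop_nulls(rows):
    """Keep only the rows in which every value is present."""
    return [row for row in rows if all(v is not None for v in row.values())]


def impute_mean(rows, column):
    """Return copies of the rows with nulls in `column` replaced by the column mean."""
    observed = [row[column] for row in rows if row[column] is not None]
    if not observed:
        raise ValueError(f"column {column!r} has no observed values to average")
    fill = mean(observed)
    return [
        {**row, column: fill if row[column] is None else row[column]}
        for row in rows
    ]


# Hypothetical data set with one missing value.
rows = [
    {"age": 30, "income": 50000.0},
    {"age": None, "income": 62000.0},
    {"age": 40, "income": 58000.0},
]

complete = drop_nulls(rows)        # discards the row with the missing age
imputed = impute_mean(rows, "age") # keeps all rows, fills the gap with the mean
```

Dropping rows is simplest but shrinks the data set, while imputation keeps every row at the cost of introducing an estimated value.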

By Drew Clancy, 12 years ago January 5, 2024
