
The Anatomy of a Data Science Project

We offer a complete solution for your manufacturing data science requirements, from the initial discovery phase through to solution deployment. Although no two data science projects are identical, they frequently consist of three stages, of which stages 2 and 3 are optional, as outlined below.

  • Discovery: Understanding your requirements, reviewing your data and creating the necessary model.
  • Simulation: Testing the model in a safe environment to ensure it performs as expected.
  • Deployment: The final solution is installed on-site.
 
The steps for the individual stages are outlined below. 

1. Discovery

1.1 Initial Consultancy

This is an initial call with the business stakeholder to establish:

  • The business requirements.
  • Whether data science is an appropriate approach.
  • Whether we have the necessary domain knowledge.
  • Whether the necessary data is available to support such a solution.
  • The potential service costs.
1.2 Problem Scoping

A discussion with the business stakeholders and subject matter experts to record:

  • Desired outcomes.
  • Data requirements and associated NDAs.
  • Data access.
  • Focal points of contact.
1.3 Data Export and Exploration

This step involves exporting the data from the client and then exploring it.

In many cases, the data is spread across several disparate sources, e.g. Excel workbooks, CSV files, SQL databases and archives from data historians. It may therefore be necessary to combine them before starting data exploration.
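
As an illustration of combining sources, the sketch below joins a CSV export with a SQL table on a shared key. It is a minimal example under stated assumptions, not a description of any particular client system: the column names (`batch_id`, `temperature_c`, `yield_pct`), the `quality` table and all values are hypothetical, and an in-memory SQLite database stands in for the real SQL source. Only the Python standard library is used to keep it self-contained.

```python
import csv
import io
import sqlite3

# Hypothetical CSV export: sensor readings keyed by batch_id.
CSV_EXPORT = """batch_id,temperature_c
B001,71.5
B002,69.8
B003,70.2
"""


def load_csv_rows(text):
    """Parse CSV text into a dict of batch_id -> temperature."""
    reader = csv.DictReader(io.StringIO(text))
    return {row["batch_id"]: float(row["temperature_c"]) for row in reader}


def load_sql_rows(conn):
    """Read yield results from the (hypothetical) quality table."""
    cursor = conn.execute("SELECT batch_id, yield_pct FROM quality")
    return {batch_id: yield_pct for batch_id, yield_pct in cursor}


def combine(temperatures, yields):
    """Inner-join the two sources on batch_id, sorted for stable output."""
    shared = sorted(temperatures.keys() & yields.keys())
    return [
        {"batch_id": b, "temperature_c": temperatures[b], "yield_pct": yields[b]}
        for b in shared
    ]


# In-memory database standing in for the client's SQL source.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE quality (batch_id TEXT, yield_pct REAL)")
conn.executemany(
    "INSERT INTO quality VALUES (?, ?)",
    [("B001", 92.1), ("B002", 88.4), ("B004", 90.0)],
)

combined = combine(load_csv_rows(CSV_EXPORT), load_sql_rows(conn))
for row in combined:
    print(row)
```

Batches that appear in only one source (B003 and B004 here) are dropped by the inner join; whether to drop or keep such records is a decision to make with the subject matter experts.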

Data exploration is used to:

  • Verify our understanding of the data.
  • Check completeness, e.g. check for nulls.
  • Identify and remove outliers.
  • Check data consistency.
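
The completeness and outlier checks can be sketched as follows. This is one possible approach, assuming a single numeric series with `None` marking missing values; the readings are invented, and the interquartile range rule with a factor of 1.5 is a common convention rather than a prescribed method.

```python
import statistics

# Hypothetical sensor readings; None marks a missing value.
readings = [70.1, 69.8, None, 70.4, 70.0, 95.3, 69.9, 70.2, None, 70.3]


def count_missing(values):
    """Completeness check: how many values are null."""
    return sum(1 for v in values if v is None)


def iqr_bounds(values, k=1.5):
    """Return (low, high) fences using the interquartile range rule."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    spread = q3 - q1
    return q1 - k * spread, q3 + k * spread


present = [v for v in readings if v is not None]
low, high = iqr_bounds(present)

# Values outside the fences are flagged; the rest form the cleaned series.
outliers = [v for v in present if v < low or v > high]
cleaned = [v for v in present if low <= v <= high]

print("missing:", count_missing(readings))
print("outliers:", outliers)
```

A flagged value is only a candidate for removal. It may be a genuine process excursion, so it is worth reviewing with someone who knows the plant before discarding it.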
1.4 Initial Model Construction

This involves preparing the data and creating an initial model. It is a critical stage, as it provides an early indication of whether the approach is valid.

If the results are poor, the recommendation may be to terminate the project at this stage.
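
One way to judge whether the initial results are encouraging or poor is to compare the model against a trivial baseline on data it has not seen. The sketch below assumes a single input and a straight-line model; the data, the variable names and the choice of mean absolute error are illustrative assumptions only.

```python
import statistics

# Hypothetical data: process temperature (x) and product yield (y).
temperature = [60, 62, 64, 66, 68, 70, 72, 74, 76, 78]
yield_pct = [80.1, 81.0, 82.2, 82.9, 84.1, 85.0, 85.8, 87.1, 88.0, 88.9]

# Hold back the last three points for evaluation.
train_x, test_x = temperature[:7], temperature[7:]
train_y, test_y = yield_pct[:7], yield_pct[7:]


def fit_line(xs, ys):
    """Ordinary least squares for one input; returns (slope, intercept)."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mean_abs_error(predictions, actuals):
    return statistics.fmean(abs(p - a) for p, a in zip(predictions, actuals))


slope, intercept = fit_line(train_x, train_y)
model_mae = mean_abs_error([intercept + slope * x for x in test_x], test_y)

# Baseline: always predict the average yield seen in training.
baseline_value = statistics.fmean(train_y)
baseline_mae = mean_abs_error([baseline_value] * len(test_y), test_y)

print(f"model MAE: {model_mae:.2f}, baseline MAE: {baseline_mae:.2f}")
```

If the model cannot clearly beat the baseline, that is an early sign that the data or the approach may not support the desired outcome.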

1.5 Enhanced Model Development and Testing

If the results from the previous step are encouraging, then the next step is model optimisation combined with extended testing. The previous step only considered a single model, which may be a sub-optimal solution. This step explores alternative models and ensemble methods to improve accuracy.
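
Comparing alternative models can be sketched as a cross-validated comparison of several candidates, including a simple averaging ensemble. Everything here is an assumption made for illustration: the data, the three toy models, the ensemble of two members and the four interleaved folds. An ensemble is not guaranteed to win; in this example the data is almost perfectly linear, so the straight-line model scores best.

```python
import statistics

# Hypothetical data: one input (x) and one target (y).
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
ys = [2.1, 3.9, 6.2, 8.1, 9.8, 12.2, 13.9, 16.1, 18.0, 20.2, 21.9, 24.1]


def fit_mean(train_x, train_y):
    """Baseline: always predict the training average."""
    avg = statistics.fmean(train_y)
    return lambda x: avg


def fit_line(train_x, train_y):
    """Least-squares straight line."""
    mx, my = statistics.fmean(train_x), statistics.fmean(train_y)
    sxx = sum((x - mx) ** 2 for x in train_x)
    sxy = sum((x - mx) * (y - my) for x, y in zip(train_x, train_y))
    slope = sxy / sxx
    return lambda x: my + slope * (x - mx)


def fit_nearest(train_x, train_y):
    """Predict the target of the closest training input."""
    pairs = list(zip(train_x, train_y))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]


def fit_ensemble(train_x, train_y):
    """Average the predictions of the line and nearest-neighbour models."""
    members = [fit_line(train_x, train_y), fit_nearest(train_x, train_y)]
    return lambda x: statistics.fmean(m(x) for m in members)


def cross_val_mae(fit, xs, ys, folds=4):
    """Mean absolute error over interleaved held-out folds."""
    errors = []
    for k in range(folds):
        test_idx = set(range(k, len(xs), folds))
        train_x = [x for i, x in enumerate(xs) if i not in test_idx]
        train_y = [y for i, y in enumerate(ys) if i not in test_idx]
        model = fit(train_x, train_y)
        errors.extend(abs(model(xs[i]) - ys[i]) for i in test_idx)
    return statistics.fmean(errors)


candidates = {
    "mean": fit_mean,
    "line": fit_line,
    "nearest": fit_nearest,
    "ensemble": fit_ensemble,
}
scores = {name: cross_val_mae(fit, xs, ys) for name, fit in candidates.items()}
best = min(scores, key=scores.get)
print(scores, "best:", best)
```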

Once the optimal model has been determined, further work may be carried out, such as sensitivity analysis.
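
A simple form of sensitivity analysis changes one input at a time and records how much the model output moves. The sketch below assumes a hypothetical fitted model, `predict_yield`, with made-up coefficients and operating point; the 5% step is an arbitrary choice.

```python
def predict_yield(temperature_c, pressure_bar, feed_rate):
    """Hypothetical fitted model, standing in for the optimised model."""
    return 40.0 + 0.5 * temperature_c + 2.0 * pressure_bar - 0.05 * feed_rate


# Assumed typical operating point.
baseline = {"temperature_c": 70.0, "pressure_bar": 5.0, "feed_rate": 100.0}


def one_at_a_time_sensitivity(model, inputs, relative_step=0.05):
    """Change in model output when each input alone is raised by relative_step."""
    reference = model(**inputs)
    effects = {}
    for name, value in inputs.items():
        perturbed = dict(inputs, **{name: value * (1 + relative_step)})
        effects[name] = model(**perturbed) - reference
    return effects


effects = one_at_a_time_sensitivity(predict_yield, baseline)

# Rank inputs by the size of their effect, largest first.
ranked = sorted(effects, key=lambda name: abs(effects[name]), reverse=True)
for name in ranked:
    print(f"{name}: {effects[name]:+.2f}")
```

This one-at-a-time approach ignores interactions between inputs, so it is best read as a first indication of which inputs matter most around the chosen operating point.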

2. Simulation

Depending upon the original business requirements, this stage may not be required.

The focus is now on using the model in a simulated environment. For example, it could be tested using out-of-sample data to see how it performs and to assess the anticipated business benefits.
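
An out-of-sample test can be sketched as replaying records the model has never seen and summarising the errors. The model, the records and the tolerance below are all hypothetical; a real acceptance criterion would come from the business requirements.

```python
import statistics


def model(temperature_c):
    """Hypothetical model fitted during the Discovery stage."""
    return 51.0 + 0.48 * temperature_c


# Hypothetical (temperature, actual yield) records from after the training period.
out_of_sample = [(71, 85.3), (73, 86.0), (75, 87.2), (77, 87.7), (79, 90.5)]

TOLERANCE = 0.5  # assumed acceptable error, in yield percentage points


def replay(model, records, tolerance):
    """Run each unseen record through the model and summarise the errors."""
    errors = [abs(model(x) - actual) for x, actual in records]
    within = sum(1 for e in errors if e <= tolerance)
    return {
        "mean_abs_error": statistics.fmean(errors),
        "max_abs_error": max(errors),
        "share_within_tolerance": within / len(errors),
    }


report = replay(model, out_of_sample, TOLERANCE)
print(report)
```

Here four of the five predictions fall within the tolerance, and the one large miss would be a prompt to investigate what was different about that record.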

3. Deployment

This stage is also optional. If required, the model would be connected to real-time data from the site, and its output could be used in an advisory capacity. In that case, any recommendations could be accepted or rejected by the relevant site personnel.
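
The advisory pattern can be illustrated with a small sketch in which the model only proposes a change and a person decides whether it takes effect. The `Recommendation` class, the `recommend` rule and the setpoint name are hypothetical placeholders, not part of any real control system interface.

```python
from dataclasses import dataclass


@dataclass
class Recommendation:
    setpoint_name: str
    current_value: float
    suggested_value: float


def recommend(latest_reading):
    """Hypothetical model: suggests a temperature setpoint from a live reading."""
    current = latest_reading["temperature_c"]
    return Recommendation("temperature_c", current, round(current + 1.5, 1))


def apply_decision(recommendation, accepted):
    """Return the setpoint in force after the operator accepts or rejects the advice."""
    if accepted:
        return recommendation.suggested_value
    return recommendation.current_value


rec = recommend({"temperature_c": 70.0})
print("suggested:", rec.suggested_value)
print("if accepted:", apply_decision(rec, accepted=True))
print("if rejected:", apply_decision(rec, accepted=False))
```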

Alternatively, the model could be placed completely online, where, for example, it could interact directly with the automation systems without any human involvement.