Data Factory

Data Factory is an open framework and toolkit for creating data flows that collect, inspect, process, and publish data. With Data Factory you can fully automate your data collection, processing, and publication.

Why use an open framework?

No proprietary tooling and no vendor lock-in

Easily extensible for adding custom functionality

All outputs and artifacts are standards-based and fully portable

The Flow of the Data Factory Process

We have created an open ecosystem of tools and specifications for powerful and frictionless data processing flows. A typical flow follows the steps below (a minimal code sketch comes after the list).

01. Loading data from various sources and file types

02. Normalizing, cleaning, and tidying the data - making it documented and portable

03. Transforming the data - changing its structure and/or contents, combining it with other datasets, etc.

04. Validating the data - making sure it is correct and adheres to your own verification rules

05. Storing the processed data in any file or data storage system
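As a rough illustration of how steps 01-05 fit together, here is a minimal sketch using the DataFlows library. The source file name, the price field, and the output path are assumptions made for the example, not part of Data Factory itself, and the exact processors you use will depend on your data.

    # A minimal sketch of steps 01-05 with the DataFlows library.
    # 'source.csv', the 'price' field, and the 'output' path are hypothetical.
    from dataflows import Flow, load, set_type, validate, dump_to_path

    def drop_negative_prices(row):
        # A custom row processor: transform the contents of each row (step 03)
        if row.get('price') is not None and row['price'] < 0:
            row['price'] = None

    datapackage, stats = Flow(
        load('source.csv'),                # 01: load data from a file or URL
        set_type('price', type='number'),  # 02: tidy the data and document its schema
        drop_negative_prices,              # 03: transform the contents
        validate(),                        # 04: check every row against the schema
        dump_to_path('output'),            # 05: store the result as a data package
    ).process()

The output directory then contains a standards-based data package (data plus a datapackage.json descriptor) that any other tool in the ecosystem can pick up.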

Why Data Factory?

A professionally selected collection of the best tools and practices

An end-to-end solution

All parts are fully integrated

Backed by a team of professionals with years of experience in similar projects

What does Data Factory contain?

The Data Package Standard - a mature and field-tested container for any sort of data.

The Frictionless Data Toolkit - a rich library of integrations and adapters to work with data packages nearly everywhere.

The DataFlows Framework - a powerful engine for creating and stream-processing data packages.

GoodTables - a thorough validation tool to make sure your data is always in good shape and form.
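To give a feel for how these pieces fit together, here is a minimal sketch using the datapackage and goodtables Python libraries. The file names are hypothetical, and the exact calls may vary between library versions.

    # A minimal sketch combining the Data Package standard with GoodTables
    # validation. 'data.csv' and 'datapackage.json' are hypothetical file names.
    from datapackage import Package   # Data Package standard (Python implementation)
    from goodtables import validate   # GoodTables validation library

    # Describe a CSV file as a data package, inferring its table schema
    package = Package()
    package.infer('data.csv')
    package.save('datapackage.json')

    # Validate the tabular data and inspect the report
    report = validate('data.csv')
    print(report['valid'], report['error-count'])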

Use Data Factory for your project

Join the chat to learn more

Get our Professional Support

Contact Us