With strategic interest and attention from many CEOs, it's clear that data, and the deep insights it can yield for the enterprise, are essential competitive levers that can play a transformative role in moving the organization forward.
Data today is the new currency that can unlock limitless new opportunities for business growth – whether for supply chain planning and inventory management, stronger and deeper customer relationship management, automation of core business processes, or simply for proactive decision-making to power revenue growth or margin performance. It’s an essential strategy for any organization looking to stay competitive and win new markets in 2023.
When leaders at a company start asking technology teams for data to answer their most pressing business questions, it initially appears to be a relatively easy task to extract data from core systems to enable reporting and analytics. However, there are a few different methods for extracting data, each rife with complexities and variables.
In this point of view, we will help you understand, in the simplest of terms, the different types of extraction and shed light on the pros and cons of the various technology choices.
Approach #1: Manual or lower cadence extract
First, technology teams can use a manual/lower cadence extract to get started with their data analysis. This can be in the form of a manual SQL statement executed against a production system at a non-peak time, or it may involve running an existing front-end user report and downloading it into Excel.
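To make this concrete, here is a minimal sketch of what such a one-off extract might look like. The table and column names (orders, customers, and so on) are purely illustrative and would need to be replaced with the structures in your own system; exact date-literal syntax also varies by database.

    -- One-off extract of recent orders with customer attributes,
    -- run against production during off-peak hours and downloaded to Excel/CSV.
    -- All table and column names below are hypothetical.
    SELECT
        o.order_id,
        o.order_date,
        o.total_amount,
        c.customer_name,
        c.region
    FROM orders o
    JOIN customers c
        ON c.customer_id = o.customer_id
    WHERE o.order_date >= DATE '2023-01-01'
    ORDER BY o.order_date;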
This type of extract will give teams a way to start analyzing their data, which can be helpful since the ultimate solution may look very different from how one starts out. This approach also ensures that valuable time is not lost on a heavier technology solution that could require a deeper understanding of database structures and data relationships. My advice to clients: always try to get your hands on data first, and then worry about automating the solution at a later stage.
Approach #2: Production database backup solutions
A second approach that many pursue is to leverage production database backup solutions for data extraction. These CDC (change data capture) solutions are comprehensive – meaning they can include all the database structures in a company's environment – and help teams avoid querying the production environment directly. These solutions are often already in place and can be quickly leveraged to support data analysis work. Note, however, that this path can be expensive and challenging to implement quickly if a solution is not already in place.
Approach #3: Read-only extracts
A final common approach is to leverage read-only extracts. These can be done at low frequency (nightly/weekly/monthly), which is another standard way of extracting data from production systems. In this model, batch jobs are created that query the environment during off-peak hours and are set up to capture changes since the last extract.
These processes can be helpful because they can leverage technology that may already be in place, such as scripts that run SQL (which can be relatively quick to enable). However, if the number of extracts grows quickly, job scheduling can become quite painful, as the timing of refreshes starts to become a constantly shifting exercise. Also, the methods for identifying what data have changed – typically via an update-time field – are manually configured by a programmer, and therefore have to be well-documented and understood.
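As a rough illustration, an incremental (or "delta") extract of this kind often boils down to a query filtered on the update-time field, with the job recording a high-water mark after each run. The object names and the :last_extract_ts bind variable below are hypothetical and shown only to sketch the pattern.

    -- Hypothetical incremental extract run as an off-peak batch job.
    -- last_updated_at is the update-time field maintained by the source system;
    -- :last_extract_ts is the high-water mark saved when the previous run finished.
    SELECT
        o.order_id,
        o.customer_id,
        o.status,
        o.total_amount,
        o.last_updated_at
    FROM orders o
    WHERE o.last_updated_at > :last_extract_ts
    ORDER BY o.last_updated_at;

    -- After loading these rows, the job stores MAX(last_updated_at) as the new
    -- high-water mark so the next run only picks up subsequent changes.

This is exactly the kind of logic a programmer configures by hand, which is why documenting the update-time conventions is so important.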
We are seeing many of our clients enable read-only extracts at high frequency, using technology that manages scheduling and data change identification. Tools such as Keboola, Matillion and other Extract, Transform, Load (ETL) vendors now embed job scheduling, data change identification, and other convenient features directly into their platforms. These tools require some training and may introduce new software into a company's technology environment, but they can be very useful when shifting from initial, manual extracts of data to more automated and reliable processes.
Start with the basics, with an eye towards automation for the longer term
Data extraction can be a complex undertaking, but it has become an essential activity as organizations of all types look to harness the power of their data to help drive towards new business goals. Determining the best approach for your organization is largely driven by your current tech landscape, data maturity level and overall organizational needs.
In summary, my best advice is to first focus on getting your hands on data, however you can initially. Then, when you really need to automate (e.g., when the benefits are worth the costs), determine the frequency of refresh required by the business and pursue more advanced capabilities for the longer term. That will help you decide whether you need near real-time and more comprehensive data extracts, or less frequent and more targeted data feeds.
Start your journey today!
Our team at Cuesta is on the front lines in helping brands to transform their approaches to data – contact us today to learn more and to explore some of our latest case studies. We would be happy to share our learnings and experiences with you!