
+30%

Efficiency gains that improved institutional performance

Data Vault 2.0

Enhanced agility and flexibility in handling large data volumes with the new architecture, while minimizing legacy system technical debt.

Data Marketplace

A shopping experience when searching for relevant datasets while maintaining auditability of all transactions within the system. 


Data Vault 2.0 at NYU: A Higher Ed Case Study in Cutting-Edge Data Solutions

NYU Overview and the Importance of Data

New York University (NYU) is a prestigious institution founded in 1831, with three degree-granting campuses, 18 schools, 25 research programs, and 11 global sites. As one of the largest employers in New York City, NYU plays a significant role in the city’s economy and education sector. In recent years, data has become an essential product and asset at NYU, playing a pivotal role in decision-making for schools and business units.

Data is particularly crucial during enrollment, where it helps ensure student success. Its importance was further highlighted during NYU's COVID-related activities, when accurate information was needed to make informed decisions quickly. However, NYU's legacy architecture presented several challenges that hindered efficient data management.

Top 5 Legacy Pain Points

According to NYU's Chief Architect, these were the top five challenges the data team faced:

1. Technical Debt

The team was running on legacy systems built 12 to 15 years ago, and managing project work and operational work at the same time had become a challenge. They kept building on top of existing systems without going back to assess long-term sustainability. Staff had been sitting up through the night tending batch processes that were failing. Moving to newer platforms and concepts eased the load on the data team and shifted its focus to managing processes it can now sustain.

2. Multiple Data Warehouses

This pain point is also related to technical debt. Each time the team built a new data warehouse, the old one was not retired, which led to duplicated processes and conflicting information. The result was data silos that made it difficult to get a holistic view of the organization’s data. It also produced multiple versions of the same data, which leads to the next pain point: no SVOT.

3. No Single Version of Truth (SVOT) for Data Sources

Disparate data sources that are not integrated produce inconsistent and inaccurate data, which in turn makes it hard to make informed business decisions. Without an agreed method for deciding what counts as the authoritative version, different groups interpret the same data without a common understanding.

To define and contextualize the data, the team launched its data governance platforms and began to improve and expand its standards and processes. They also started building business glossaries so that everyone understands the data definitions in the same way.

4. Convoluted Processes

Accessing data was a convoluted process. The ETL processes were unmanageable and prone to failure. Because they depended heavily on batch jobs, the team wanted to move away from a batch mode of operation.

5. Rigid Design That Made It Difficult to Adapt to Changes Quickly

The technical processes were too rigid to be handled in an Agile fashion. The goal was to change the whole foundation of the design with Data Vault, to make it more flexible and manageable, even for future changes.

Eon Collective's Role in Addressing Legacy Architecture Challenges

Eon Collective stepped in to help NYU overcome these challenges by implementing Data Vault 2.0 methodology.

Some members of NYU’s data team had already attended training sessions and conferences on this approach. The implementation involved investing in a platform for real-time data ingestion using change-data-capture (CDC) processes, along with serverless computing and API platforms for the ingestion procedures.
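To make the change-data-capture idea concrete, here is a minimal sketch of how a serverless-style handler might append each change event to an insert-only store. The event shape (`key`, `op`, `after`, `source`) and the function names are assumptions made for illustration; they do not describe NYU's actual platform or any specific CDC product.

```python
from datetime import datetime, timezone


def handle_change_event(event, raw_store):
    """Append one CDC event to an insert-only store (hypothetical event shape)."""
    record = {
        "key": event["key"],              # business key of the changed row
        "operation": event["op"],         # "insert", "update", or "delete"
        "payload": event.get("after"),    # row image after the change, if any
        "load_date": datetime.now(timezone.utc),
        "record_source": event["source"],
    }
    # Nothing is overwritten: every change is kept, which preserves history.
    raw_store.append(record)
    return record


def current_state(raw_store):
    """Replay the stored changes in order to derive the latest row per key."""
    state = {}
    for record in raw_store:
        if record["operation"] == "delete":
            state.pop(record["key"], None)
        else:
            state[record["key"]] = record["payload"]
    return state


# Illustrative usage with made-up events.
store = []
handle_change_event(
    {"key": "N1", "op": "insert", "after": {"status": "admitted"}, "source": "sis"},
    store,
)
handle_change_event(
    {"key": "N1", "op": "update", "after": {"status": "enrolled"}, "source": "sis"},
    store,
)
handle_change_event({"key": "N2", "op": "delete", "source": "sis"}, store)
```

Because each event is processed as it arrives, a handler like this avoids the nightly batch window, and the append-only store keeps a full change history that downstream layers can replay.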

1. Implementing Data Vault Concepts: Business Vault & Raw Vault

NYU built an S3 data lake and implemented the Data Vault concepts of business vault, raw vault, hubs, and links. The raw vault contains unprocessed data from various sources in its original form. The business vault, in contrast, stores enriched data that has been processed for easier consumption by end users. Hubs hold the unique business keys of core entities, and links record the relationships between those hubs, which is how different datasets are connected within the Data Vault.
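The hub and link concepts can be sketched in a few lines. In Data Vault 2.0, hub and link rows are commonly identified by a hash of the normalized business key(s); the sketch below assumes MD5, a `||` delimiter, and trim-and-uppercase normalization, which are common conventions rather than details confirmed for NYU's implementation. The student and course entities, the `sis` source name, and all function names are hypothetical.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone

DELIMITER = "||"  # assumed separator between business keys


def compute_hash_key(*business_keys):
    """Hash one or more business keys after trimming and upper-casing them."""
    normalized = DELIMITER.join(key.strip().upper() for key in business_keys)
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class HubRow:
    hash_key: str
    business_key: str
    load_date: datetime
    record_source: str


@dataclass(frozen=True)
class LinkRow:
    hash_key: str
    hub_hash_keys: tuple
    load_date: datetime
    record_source: str


def make_hub_row(business_key, record_source):
    """Build a hub row: one row per unique business key."""
    return HubRow(
        hash_key=compute_hash_key(business_key),
        business_key=business_key.strip(),
        load_date=datetime.now(timezone.utc),
        record_source=record_source,
    )


def make_link_row(hub_rows, record_source):
    """Build a link row relating two or more hubs."""
    return LinkRow(
        hash_key=compute_hash_key(*(hub.business_key for hub in hub_rows)),
        hub_hash_keys=tuple(hub.hash_key for hub in hub_rows),
        load_date=datetime.now(timezone.utc),
        record_source=record_source,
    )


# Illustrative usage with made-up keys.
student = make_hub_row(" n12345 ", "sis")
course = make_hub_row("CS-101", "sis")
enrollment = make_link_row([student, course], "sis")
```

Deriving keys from business keys rather than from database sequences means hubs and links can be loaded independently and in parallel, which is one reason this design adapts to new sources without reworking the whole model.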

2. Adopting Automation Technology

To reduce time and resources spent on managing their new architecture, NYU adopted automation technology for a metadata-driven approach. This involved investing in an automation engine that allowed them to create reusable templates and standards for data-related processes while ensuring transparency through a robust data governance platform.
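A metadata-driven approach can be illustrated with a small sketch: a single reusable template produces the table definition for every hub listed in a metadata catalog. The template text, column names, and metadata entries below are assumptions chosen for illustration and do not reflect the automation engine NYU adopted.

```python
from string import Template

# One reusable template for every hub table (hypothetical column layout).
HUB_TEMPLATE = Template(
    "CREATE TABLE hub_${entity} (\n"
    "    ${entity}_hk CHAR(32) PRIMARY KEY,\n"
    "    ${business_key} VARCHAR(255) NOT NULL,\n"
    "    load_date TIMESTAMP NOT NULL,\n"
    "    record_source VARCHAR(100) NOT NULL\n"
    ");"
)

# Metadata describing the hubs to generate (made-up entries).
METADATA = [
    {"entity": "student", "business_key": "student_id"},
    {"entity": "course", "business_key": "course_code"},
]


def render_hub_ddl(metadata):
    """Render one CREATE TABLE statement per metadata entry."""
    return [HUB_TEMPLATE.substitute(entry) for entry in metadata]


ddl_statements = render_hub_ddl(METADATA)
```

Adding a new hub then means adding one metadata entry rather than hand-writing another script, which keeps the generated structures consistent with the agreed standards.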

3. Data Marketplace: Shopping Experience & Auditability

NYU also developed a data marketplace that provided users with a shopping experience when searching for relevant datasets while maintaining auditability of all transactions within the system. This innovative approach helped streamline access to information across various departments at NYU.

Business Value from Leveraging Data Vault Methodology

Implementing the Data Vault 2.0 methodology with Eon Collective brought significant benefits to NYU’s IT infrastructure and to decision-making across departments. The new architecture provided agility and flexibility in managing large volumes of information while reducing the technical debt associated with legacy systems.

Faster and more flexible data ingestion allowed NYU to adapt quickly to changes without significantly impacting the entire model. In addition, the improved transparency provided by the governance platforms enabled better collaboration between the IT teams that manage the infrastructure and the business units that rely on accurate information for decision-making.

Higher education institutions generate vast amounts of data. In fact, it is estimated that a single large university can produce more data than the Library of Congress.
