efficiency gains which improved institutional performance.
NYU Overview and the Importance of Data
New York University (NYU) is a prestigious institution founded in 1831, with three degree-granting campuses, 18 schools, 25 research programs, and 11 global sites. As one of the largest employers in New York City, NYU plays a significant role in the city’s economy and education sector. In recent years, data has become an essential product and asset at NYU, playing a pivotal role in decision-making for schools and business units.
Data is particularly crucial during enrollment processes to ensure student success. The importance of data has been further highlighted during COVID-related activities when accurate information was needed to make informed decisions quickly. However, legacy architecture at NYU presented several challenges that hindered efficient data management.
Top 5 Legacy Pain Points
According to NYU’s Chief Architect, these were the top five challenges the team faced:
1. Technical Debt
They were running on legacy systems built 12-15 years ago, and managing project work and operational work at the same time became a challenge. They kept building on top of existing systems without going back to consider future sustainability. Previously, people sat up through the night watching batch processes that were failing. The decision to move to newer platforms and concepts eased the burden on the data team and shifted its focus to managing processes that it can now sustain.
2. Multiple Data Warehouses
3. No Single Version of Truth (SVOT) for Data Sources
Disparate data sources that are not integrated produce inconsistencies and inaccuracies in data, which in turn makes it harder to reach informed business decisions. Without an agreed plan or method for what counts as the universal definition, data is interpreted without a common understanding.
They wanted to spend time defining and contextualizing the data, so they initiated their data governance platforms. They began to improve and expand standards and processes, and started work on business glossaries so that data definitions are understood by everyone.
4. Convoluted Processes
5. Rigid Design that made it difficult to adapt to changes quickly
The technical processes were too rigid to be handled in an Agile fashion. The goal was to change the whole foundation of the design with Data Vault, to make it more flexible and manageable, even for future changes.
EON Collective's Role in Addressing Legacy Architecture Challenges
EON Collective stepped in to help NYU overcome these challenges by implementing Data Vault 2.0 methodology.
Some members of NYU’s data team had already attended training sessions and conferences on this approach. The implementation involved investing in a platform for real-time data ingestion using change-data-capture (CDC) processes, while using serverless computing and API platforms for the ingestion procedures.
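To make the CDC idea concrete, here is a minimal sketch in Python of how change events might be appended to an insert-only history and replayed to get the current state. It is not NYU's or EON Collective's actual pipeline: the `ChangeEvent` type, its fields, and the `apply_events` and `current_state` helpers are hypothetical names chosen for illustration, and the sketch assumes the source log gives every change a unique, increasing sequence number.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ChangeEvent:
    """One captured change (hypothetical shape, for illustration only)."""
    op: str                  # "insert", "update", or "delete"
    key: str                 # business key of the changed record
    payload: Optional[dict]  # new column values; None for a delete
    sequence: int            # assumed unique, increasing position in the source log


def apply_events(history: list, events: list) -> list:
    """Append events to the history in sequence order, skipping ones already stored."""
    seen = {e.sequence for e in history}
    for event in sorted(events, key=lambda e: e.sequence):
        if event.sequence not in seen:
            history.append(event)
            seen.add(event.sequence)
    return history


def current_state(history: list) -> dict:
    """Replay the history to rebuild the latest value for each key."""
    state = {}
    for event in sorted(history, key=lambda e: e.sequence):
        if event.op == "delete":
            state.pop(event.key, None)
        else:
            state[event.key] = event.payload
    return state


history = []
events = [
    ChangeEvent("insert", "S1", {"status": "admitted"}, 1),
    ChangeEvent("update", "S1", {"status": "enrolled"}, 2),
    ChangeEvent("insert", "S2", {"status": "admitted"}, 3),
    ChangeEvent("delete", "S2", None, 4),
]
apply_events(history, events)
apply_events(history, events)  # replaying the same batch adds nothing
print(current_state(history))  # {'S1': {'status': 'enrolled'}}
```

Because the history is only ever appended to, a failed or repeated batch can be replayed safely, which is one reason CDC feeds tend to be easier to operate than overnight full reloads.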
1. Implementing Data Vault Concepts: Business Vault & Raw Vault
NYU built an S3 data lake and implemented the Data Vault concepts of business vault, raw vault, hubs, and links. The raw vault contains unprocessed data from various sources in its original format. In contrast, the business vault stores enriched data that has been processed for easier consumption by end-users. Hubs hold the unique business keys of core entities, and links record the relationships between those hubs, which is how different datasets are connected within the Data Vault.
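The hub and link concepts can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not NYU's schema: the `HubRow` and `LinkRow` types, the `make_hub` and `make_link` helpers, the student and course examples, and the source names are all hypothetical. It follows the common Data Vault 2.0 practice of deriving keys by hashing normalized business keys; MD5 is used here only as one frequently seen choice.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone


def hash_key(*business_keys: str) -> str:
    """Derive a deterministic key from one or more normalized business keys."""
    normalized = "||".join(k.strip().upper() for k in business_keys)
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class HubRow:
    """One unique business key, e.g. a student ID (hypothetical layout)."""
    hub_key: str
    business_key: str
    load_ts: datetime
    record_source: str


@dataclass(frozen=True)
class LinkRow:
    """A relationship between two or more hubs (hypothetical layout)."""
    link_key: str
    hub_keys: tuple
    load_ts: datetime
    record_source: str


def make_hub(business_key: str, record_source: str) -> HubRow:
    now = datetime.now(timezone.utc)
    return HubRow(hash_key(business_key), business_key.strip(), now, record_source)


def make_link(hubs: list, record_source: str) -> LinkRow:
    # The link key is hashed from the business keys of the participating hubs.
    now = datetime.now(timezone.utc)
    link_key = hash_key(*(h.business_key for h in hubs))
    return LinkRow(link_key, tuple(h.hub_key for h in hubs), now, record_source)


student = make_hub("N12345", "student_system")
course = make_hub("CS-101", "course_catalog")
enrollment = make_link([student, course], "registration_feed")
```

Because keys are derived from business keys rather than assigned by one system, the same student arriving from two sources resolves to the same hub key, and a new relationship can be added as a new link without reshaping existing tables.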
2. Adopting Automation Technology
3. Data Marketplace: Shopping Experience & Auditability
Business Value from Leveraging Data Vault Methodology
The implementation of EON Collective’s Data Vault 2.0 methodology brought significant benefits to NYU’s IT infrastructure and overall decision-making process across different departments. The new architecture provided agility and flexibility in managing vast amounts of information while reducing technical debt associated with legacy systems.
Faster and more flexible data ingestion processes allowed NYU to adapt quickly to changes without significantly impacting its entire model. Furthermore, improved transparency through robust governance platforms enabled better collaboration between the IT teams responsible for managing infrastructure and the business units that rely on accurate information for decision-making.
Higher education institutions generate vast amounts of data. In fact, it is estimated that a single large university can produce more data than the Library of Congress.