Before you start to build out the data architectures of your cloud-scale analytics framework, review the articles in the following table.
Section | Description |
---|---|
Build an initial strategy | How to build your data strategy and pivot to become a data-driven organization. |
Define your plan | How to develop a plan for cloud-scale analytics. |
Prepare analytics estate | How to prepare your cloud-scale analytics estate, with key design area considerations such as enterprise enrollment, networking, identity and access management, policies, and business continuity and disaster recovery. |
Govern your analytics | Requirements for governing data, including data catalog, lineage, master data management, data quality, data sharing agreements, and metadata. |
Secure your analytics estate | How to secure your analytics estate with authentication and authorization, data privacy, and data access management. |
Organize people and teams | How to organize effective operations, roles, teams, and team functions. |
Manage your analytics estate | How to provision the platform and observability for a scenario. |
Physical architecture
The physical implementation of cloud-scale analytics consists of two main architectures: the data management landing zone and the data landing zone.
Data applications
Data applications are a core concept for delivering a data product and can be aligned to both lakehouse and data mesh patterns.
Cloud-scale analytics
You can scale your cloud-scale analytics deployment by using multiple data landing zones.
Data mesh
Implement data mesh by using cloud-scale analytics. Although most cloud-scale analytics guidance applies, there are some differences to be aware of for data domains, self-serve data platforms, onboarding data products, governance, data marketplace, and data sharing.
Deployment templates for cloud-scale analytics
The following table lists reference templates that you can deploy.
Repository | Content | Required | Deployment model |
---|---|---|---|
Data management template | Central data management services and shared data services like data catalog and self-hosted integration runtime | Yes | One per cloud-scale analytics |
Data landing zone template | Data landing zone shared services, including ingestion, management, and data storage services | Yes | One per data landing zone |
Data integration template - batch processing | Additional services necessary for batch data processing | No | One or more per data landing zone |
Data integration template - stream processing | Additional services necessary for data stream processing | No | One or more per data landing zone |
Data product template - analytics and data science | Additional services necessary for data analytics and AI | No | One or more per data landing zone |
These templates contain Azure Resource Manager templates, the templates' parameter files, and CI/CD pipeline definitions for resource deployment.
Templates can change over time due to new Azure services and requirements. Secure each repository's main branch so it remains error-free and ready for consumption and deployment. Use a development subscription to test template configuration changes before you merge feature enhancements back into your main branch.
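The deployment models in the table above can be read as cardinality rules: exactly one data management deployment for the whole estate, exactly one data landing zone deployment per zone, and any number of the optional integration and product templates per zone. The following Python sketch checks a planned set of deployments against those rules. It is only an illustration: the template keys (`data-management`, `integration-batch`, and so on), the zone names, and the `validate_plan` helper are hypothetical labels chosen for this example, not identifiers from the template repositories.

```python
from collections import Counter
from typing import Optional

# Cardinality rules transcribed from the deployment templates table.
# Keys are hypothetical short labels, not real repository names.
# Value: (scope, minimum, maximum); maximum None means "no upper limit".
RULES = {
    "data-management": ("estate", 1, 1),     # required, one per cloud-scale analytics
    "data-landing-zone": ("zone", 1, 1),     # required, one per data landing zone
    "integration-batch": ("zone", 0, None),  # optional, one or more per zone
    "integration-stream": ("zone", 0, None),
    "product-analytics": ("zone", 0, None),
}

# A planned deployment: (template key, landing zone name or None for estate-wide).
Deployment = tuple[str, Optional[str]]


def validate_plan(plan: list[Deployment]) -> list[str]:
    """Return a list of problems; an empty list means the plan fits the table."""
    problems: list[str] = []

    # Check that each entry uses a known template at the right scope.
    for template, zone in plan:
        if template not in RULES:
            problems.append(f"unknown template: {template}")
        elif RULES[template][0] == "estate" and zone is not None:
            problems.append(f"{template} is estate-wide and must not name a zone")
        elif RULES[template][0] == "zone" and zone is None:
            problems.append(f"{template} must name a data landing zone")

    # Check how many times each template appears per scope.
    counts = Counter(plan)
    zones = sorted({zone for _, zone in plan if zone is not None})
    for template, (scope, minimum, maximum) in RULES.items():
        targets = [None] if scope == "estate" else zones
        for target in targets:
            found = counts[(template, target)]
            where = target or "the estate"
            if found < minimum:
                problems.append(
                    f"{template}: need at least {minimum} in {where}, found {found}"
                )
            if maximum is not None and found > maximum:
                problems.append(
                    f"{template}: at most {maximum} in {where}, found {found}"
                )
    return problems


if __name__ == "__main__":
    plan = [
        ("data-management", None),
        ("data-landing-zone", "zone-a"),
        ("integration-batch", "zone-a"),
        ("data-landing-zone", "zone-b"),
    ]
    print(validate_plan(plan) or "plan matches the deployment models")
```

A check like this could run in a CI/CD pipeline before deployment, so that a plan with a missing data management deployment or a duplicated landing zone deployment is caught early.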
Connect to environments privately
The reference architecture is secure by design. It uses a multilayered security approach to overcome common data exfiltration risks.
The simplest security solution is to host a jumpbox on the virtual network of the data management landing zone or a data landing zone, and use it to connect to the data services through private endpoints.
Frequently asked questions
For a list of questions and answers about cloud-scale analytics, see Frequently asked questions.