The code will provive you a step-by-step guide to create a DataMart with transport information.
This DataMart is build using Azure Data Factory for ELT/ETL and Azure Synapse as database. The final stage is a Power BI report reading the information from Synapse database.
Some important aspects in the solution is how to take advantage of dynamic content in Azure Data Factoty, creating a clean end-to-end solution.
The solution idea is to speed-up ETL / ETL development and simplify how you code your pipelines in Azure Data Factory. Also how you can take advantage of using a combination of Azure Synapse and Power BI as your standard serving layer solution.
The solution is splited in:
- Create Azure Services.
- Deploy database scripts.
- Deploy Azure Data Factory ARM template.
- Process data in Azure Data Factory.
- Import data into a Power BI template.
Understanding configuration table
- Access to Azure account with permission to create Azure services
- Permission to install PowerShell components in your local computer
- A Power BI Free or Pro license
You need around 1 hour to deploy entire solution.
The Azure Services can be created in Azure Portal or executing PowerShell scripts:
Open: Scripts location
Follow below steps to create Azure Services:
-
Create resource group.ps1
-
Create Storage Account.ps1
-
Create Azure Synapse Workspace.ps1
-
Create Azure SQL Pool.ps1
-
Create Azure Data Factory.ps1
Once you create Azure services, is necessary to deploy database objects. Please execute next steps:
Open: Scripts location
Follow below steps to create Azure Services:
-
Create schemas.sql
-
Create operational tables.sql
-
Create data warehouse tables.sql
-
Create stage tables.sql
-
Create external tables.sql
-
Insert dimensions data.sql
-
Insert configuration table.sql
This is the ARM template you are about to deploy:
Once you deployed Azure services and Synapse objects you need to deploy Azure Data Factory ARM template. Please execute next steps:
Open: ARM template location
1.1 - Before deploy ARM template file, you need to replace some values. There are 6 points to replace the information before deploy scripts. Please replace these values carfully:
- Once you replace the values with your own information, do the follow:
- Open your Azure Data Factory;
- On manage tab, select "ARM Template" option;
- Select option "Import ARM Template";
- On "Custom Deployment" editor, select option "Build your own template in the editor";
- Load the file "arm_template.json" and save;
- Select the resource group, region and factory name (same name you deployed with PoweShell);
- Once deploy is done, will be necessary to process the pipeline:
DO NOT FORGET TO ADD YOUR SYNAPSE PASSWORD AT SYNAPSE LINKED SERVICES
Is necessary to upload a file into the storage account. Please follow the instructions:
-
Download file named "taxi_zone_lookup.csv": Dowload CSV file
-
Upload file into this location in the storage account: "nyccabdata\ingestion\lookup\taxi_zone_lookup.csv"
After process execution, open Power BI template file and fill the parameters values:
Open: Power BI file template
- RangeStart: '2019-01-01 00:00:00.000'
- RangeEnd: '2019-01-01 23:59:59.000'
- ServerName: [Use your server name]
- DatabaseName: "TransportDW"
This Power BI file is configured with incremental refresh. You can refresh Power BI file and then deploy it to Power BI Service.
Please check: Code of conduct
Please check: Security
Please check: Support
This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.
When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.
This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact [email protected] with any additional questions or comments.
This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos are subject to those third-party's policies.