1 parent 3d72c22, commit 96d4dec
Showing 22 changed files with 629 additions and 1,071 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,30 @@ | ||
# Data Governance | ||
|
||
Data governance (DG) is the process of managing the availability, usability, integrity and security of the [data](https://searchdatamanagement.techtarget.com/definition/data) in enterprise systems, based on internal data standards and policies that also control data usage. Effective data governance ensures that data is consistent and trustworthy and doesn't get misused. It's increasingly critical as organizations face new data privacy regulations and rely more and more on data analytics to help optimize operations and drive business decision-making. | ||
|
||
Ethical Principles around Data | ||
|
||
1. Autonomy - The right to control your data, possibly via surrogates | ||
2. Informed consent - You should explicitly approve use of your data, based on an understanding of how it will be used | ||
3. Beneficence - People using your data should do it for your benefit | ||
4. Non-maleficence - Do no harm | ||
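
The first two principles can be made concrete as a consent check that runs before any use of a record. The following is a minimal sketch, not taken from any specific governance framework: `ConsentRecord`, `may_use`, and the purpose strings are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ConsentRecord:
    """Purposes a data subject (or their surrogate) has explicitly approved.

    Hypothetical structure for illustration only.
    """
    subject_id: str
    approved_purposes: set[str] = field(default_factory=set)


def may_use(consent: ConsentRecord, purpose: str) -> bool:
    """Allow use only for a purpose the subject explicitly approved."""
    return purpose in consent.approved_purposes


# The subject approved billing, and nothing else.
consent = ConsentRecord("user-42", {"billing"})
print(may_use(consent, "billing"))
print(may_use(consent, "marketing"))
```

Defaulting to an empty set of approved purposes means that any use is denied until consent is recorded, which matches the "explicitly approve" wording above.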
|
||
## ODPi | ||
|
||
ODPi creates open source standards to help you use and understand data across all platforms. | ||
|
||
https://www.odpi.org | ||
|
||
https://searchdatamanagement.techtarget.com/definition/data-governance | ||
|
||
https://en.wikipedia.org/wiki/Data_governance | ||
|
||
https://www.oreilly.com/content/data-governance-and-the-death-of-schema-on-read | ||
|
||
![managing sensitive data](../../media/Pasted%20image%2020240228190110.png) | ||
|
||
![Data Governance](../../../media/Pasted%20image%2020240213122425.png) | ||
|
||
## Links | ||
|
||
[Designing Data Governance from the Ground Up • Lauren Maffeo & Samia Rahman • GOTO 2023 - YouTube](https://www.youtube.com/watch?v=A8dVHjRENBQ) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,23 @@ | ||
# Social Media Analytics Solution | ||
|
||
[Build and deploy a social media analytics solution - Azure Architecture Center | Microsoft Learn](https://learn.microsoft.com/en-us/azure/architecture/ai-ml/architecture/build-deploy-social-media-analytics-solution) | ||
|
||
[Social media analysis with Azure Stream Analytics - Azure Stream Analytics | Microsoft Learn](https://learn.microsoft.com/en-us/azure/stream-analytics/stream-analytics-twitter-sentiment-analysis-trends) | ||
|
||
![social media analytics solution architecture](../media/Pasted%20image%2020240227211925.png) | ||
|
||
## Dataflow | ||
|
||
1. Azure Synapse Analytics pipelines ingest external data and store that data in Azure Data Lake. One pipeline ingests data from news APIs. The other pipeline ingests data from the Twitter API. | ||
2. Apache Spark pools in Azure Synapse Analytics are used to process and enrich the data. | ||
3. The Spark pools use the following services: | ||
- Azure Cognitive Service for Language, for named entity recognition (NER), key phrase extraction, and sentiment analysis | ||
- Azure Cognitive Services Translator, to translate text | ||
- Azure Maps, to link data to geographical coordinates | ||
4. The enriched data is stored in Data Lake. | ||
5. A serverless SQL pool in Azure Synapse Analytics makes the enriched data available to Power BI. | ||
6. Power BI Desktop dashboards provide insights into the data. | ||
7. As an alternative to the previous step, Power BI dashboards that are embedded in Azure App Service web apps provide web and mobile app users with insights into the data. | ||
8. As an alternative to steps 5 through 7, the enriched data is used to train a custom machine learning model in Azure Machine Learning. | ||
9. The model is deployed to a Machine Learning endpoint. | ||
10. A managed online endpoint is used for online, real-time inferencing, for instance, on a mobile app (**A**). Alternatively, a batch endpoint is used for offline model inferencing (**B**). |
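
Steps 1 through 4 of the dataflow above can be sketched as a small ingest-enrich-store pipeline. This is a toy outline under stated assumptions: `Document`, `ingest`, `enrich`, and `run_pipeline` are hypothetical names, the sample texts are made up, and the keyword-based sentiment rule only stands in for the Azure language, translation, and maps services. No Azure SDK calls are shown.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    source: str  # "news" or "twitter"
    text: str
    enrichments: dict = field(default_factory=dict)


def ingest() -> list[Document]:
    """Step 1: stand-in for the two ingestion pipelines (made-up sample data)."""
    return [
        Document("news", "Contoso opens a new office in Seattle"),
        Document("twitter", "Loving the new Contoso app!"),
    ]


def enrich(doc: Document) -> Document:
    """Steps 2-3: stand-in for the enrichment services.

    A toy keyword rule replaces real sentiment analysis.
    """
    is_positive = "loving" in doc.text.lower()
    doc.enrichments["sentiment"] = "positive" if is_positive else "neutral"
    return doc


def run_pipeline() -> list[Document]:
    raw = ingest()                       # step 1: ingest external data
    enriched = [enrich(d) for d in raw]  # steps 2-3: process and enrich
    return enriched                      # step 4: would be written back to the data lake


for doc in run_pipeline():
    print(doc.source, doc.enrichments)
```

Keeping each stage as a separate function mirrors the separation between pipelines, Spark pools, and storage in the architecture, so a stage can be swapped for a real service call without touching the others.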
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,158 @@ | ||
# Solutions | ||
|
||
[Artificial intelligence (AI) architecture - Azure Architecture Center | Microsoft Learn](https://learn.microsoft.com/en-us/azure/architecture/ai-ml/) | ||
|
||
- Explore ideas about | ||
- Document processing | ||
- Content tagging with NLP | ||
- Knowledge mining for customer feedback | ||
- Large-scale custom NLP | ||
- Image processing | ||
- Image classification with CNNs | ||
- Retail assistant with visual capabilities | ||
- Visual assistant | ||
- Vision classifier model | ||
- Audio processing | ||
- Keyword digital text processing | ||
- Predictive analytics | ||
- Customer churn prediction | ||
- Personalized offers | ||
- Marketing optimization | ||
- Personalized marketing solutions | ||
- Chat bots | ||
- Search and query a knowledge base | ||
- AI at the edge | ||
- AI at the edge with Azure Stack Hub | ||
- Disconnected AI at the edge with Azure Stack Hub | ||
- Video ingestion and object detection on the edge | ||
- Document enrichment | ||
- AI enrichment with Cognitive Search | ||
- MLOps | ||
- Model deployment to AKS | ||
- Orchestrate MLOps with Azure Databricks | ||
- Deploy AI and ML at the edge | ||
- Many models ML with Spark | ||
- Many models with Machine Learning | ||
- Other ideas | ||
- Azure Machine Learning architecture | ||
- Autonomous systems | ||
- Data science and machine learning | ||
- Design architectures | ||
- Chat bots | ||
- Baseline end-to-end chat with OpenAI | ||
- Document processing | ||
- Automate document classification | ||
- Automate document processing | ||
- Automate PDF form processing | ||
- Build custom document processing models | ||
- Multiple indexers with Azure Cognitive Search | ||
- Video and image classification | ||
- Automate video analysis | ||
- Image classification | ||
- Audio processing | ||
- Speech transcription pipeline | ||
- Extract and analyze call center data | ||
- Predictive analytics | ||
- Determine customer lifetime and churn | ||
- Batch scoring | ||
- Batch scoring for deep learning | ||
- Batch scoring with Python | ||
- Batch scoring with R | ||
- Batch scoring with Spark on Databricks | ||
- Recommendations | ||
- Real-time recommendation API | ||
- [Social media analytics solution](ai/social-media-analytics-solution.md) | ||
- Monitoring | ||
- Monitor OpenAI models | ||
- Regulatory | ||
- Secure research for regulated data | ||
- Apply guidance | ||
- Machine learning options | ||
- Document processing | ||
- OpenAI GPT-3 summarization | ||
- Build language model pipelines | ||
- Audio processing | ||
- Custom speech-to-text overview | ||
- Custom speech-to-text | ||
- Conversation summarization | ||
- MLOps | ||
- Machine learning operations (MLOps) v2 | ||
- MLOps for Python models | ||
- Network security for MLOps | ||
- MLOps maturity model | ||
- Upscale ML lifecycle with MLOps | ||
- Team Data Science Process | ||
- Overview | ||
- Lifecycle | ||
- Overview | ||
- 1. Business understanding | ||
- 2. Data acquisition and understanding | ||
- 3. Modeling | ||
- 4. Deployment | ||
- 5. Customer acceptance | ||
- Roles and tasks | ||
- Overview | ||
- Group manager | ||
- Team lead | ||
- Project lead | ||
- Individual contributor | ||
- Development | ||
- Agile development | ||
- Collaborative coding with Git | ||
- Execute data science tasks | ||
- Code testing | ||
- Track progress | ||
- Operationalization | ||
- DevOps - CI/CD | ||
- Training | ||
- For data scientists | ||
- How To | ||
- Set up data science environments | ||
- Environment setup | ||
- Platforms and tools | ||
- Analyze business needs | ||
- Identify your scenario | ||
- Acquire and understand data | ||
- Ingest data | ||
- Overview | ||
- Move to/from Blob storage | ||
- Overview | ||
- Use Storage Explorer | ||
- Use SSIS | ||
- Move to SQL on a VM | ||
- Move to Azure SQL Database | ||
- Move to Hive tables | ||
- Move to SQL partitioned tables | ||
- Move from on-premises SQL | ||
- Explore and visualize data | ||
- Prepare data | ||
- Explore data | ||
- Overview | ||
- Explore Azure Blob Storage | ||
- Sample data | ||
- Overview | ||
- Use Blob Storage | ||
- Use SQL Server | ||
- Process data | ||
- Access with Python | ||
- Use Azure Data Lake | ||
- Use SQL VM | ||
- Use data pipeline | ||
- Use Spark | ||
- Use Scala and Spark | ||
- Develop models | ||
- Engineer features | ||
- Overview | ||
- Deploy models in production | ||
- Build and deploy a model using Azure Synapse Analytics | ||
- OpenAI | ||
- Explore ideas about | ||
- Search and query a knowledge base | ||
- Design architectures | ||
- Baseline end-to-end chat with OpenAI | ||
- Extract and analyze call center data | ||
- Monitor OpenAI models | ||
- Apply guidance | ||
- Build language model pipelines | ||
- OpenAI GPT-3 summarization | ||
- Conversation summarization |