Databricks+ADF E2E Sample (Milestone 1) #782

Open
26 of 32 tasks
ydaponte opened this issue Nov 5, 2024 · 0 comments


ydaponte commented Nov 5, 2024

Milestone 1: mid-Jan.

E2E Sample "Technical Problem"

  • Technical Problem: Restore the sample's ability to deploy end to end, and modernize it with the latest features from Databricks and ADF
  • Persona: Data Engineers and Developers

IaC

  • Fixing sample devcontainer
  • Generate data independently of the public REST API (so it can be reused by the other E2E sample), and allow the data to be ingested in both batch and streaming modes
  • Fixing Bicep scripts that fail with the latest versions and newer schemas (Synapse SQL Dedicated pool)
  • Fixing the main ./deploy.sh script, which breaks when configuring the Databricks workspace for dev, stg and prod
  • Upgrade to the latest Databricks CLI, because the previous version no longer works for certain parts of the code
  • Incorporate the deployment of a Unity Catalog
  • Fixing Key Vault Soft-Delete capability in the clean-up and deployment scripts
  • Act on issues identified on the ISE Checklist (Other fundamentals)
  • Document all changes, including architecture schemas when applicable
  • [27.11.2024] As more people are now contributing, we added further improvements to the milestone, for example: README and guidance improvements, upgrading the Bicep versions used in the sample, and replacing SQL authentication with Entra ID-only authentication.
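
As a rough illustration of the data-generation item above, here is a minimal sketch of a seeded generator that serves both batch and streaming ingestion without calling any external API. All names here (`make_record`, `stream_records`, `batch_records`) and the record fields are hypothetical and are not taken from the sample; the real schema would follow whatever the sample's pipelines expect.

```python
import itertools
import random
from datetime import datetime, timedelta, timezone
from typing import Iterator, Optional


def make_record(rng: random.Random, ts: datetime) -> dict:
    """Build one synthetic reading. The field names are illustrative only."""
    return {
        "device_id": f"device-{rng.randint(1, 5):03d}",
        "reading": round(rng.uniform(0.0, 100.0), 2),
        "event_time": ts.isoformat(),
    }


def stream_records(seed: int = 42, start: Optional[datetime] = None) -> Iterator[dict]:
    """Streaming mode: yield records one at a time, indefinitely."""
    rng = random.Random(seed)  # a fixed seed keeps runs reproducible
    ts = start or datetime(2024, 1, 1, tzinfo=timezone.utc)
    while True:
        yield make_record(rng, ts)
        ts += timedelta(seconds=1)


def batch_records(count: int, seed: int = 42) -> list:
    """Batch mode: materialize a fixed number of records from the same stream."""
    return list(itertools.islice(stream_records(seed), count))
```

Because both modes draw from the same seeded stream, a batch of N records matches the first N streamed records, which may make it easier to compare batch and streaming ingestion results in tests.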

CI/CD

  • Incorporate Databricks Asset Bundles into the Databricks CI/CD process
  • Act on issues identified on the ISE Checklist (Other fundamentals)
  • Document all changes

Security

  • Start and act on the ISE Checklist on the Security pillar
  • Document all changes
  • Replace SQL authentication with Entra-only authentication

Observability

  • Act on issues identified on the ISE Checklist (Observability)
  • Document all changes, including architecture schemas when applicable

Testing

  • Document all changes, including architecture schemas when applicable

Python Package modernization

For package development, update to the latest best practices. Align with the Fabric update team on Python package modernization.

VNext Milestone ideas:

  • Adding Autoloader to ingestion
  • Adding Industry packages to the sample
  • Add a dependency map of the versions across the sample

Ensure you've read the Contributing Guide (Microsoft Internal only.)
