Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add backups to workspaces. #4374

Open
wants to merge 19 commits into
base: main
Choose a base branch
from

Conversation

james-annages
Copy link

Resolves #4362

This pull request introduces significant updates to the base template for workspaces, primarily focusing on adding backup capabilities and enhancing the cleanup process for Azure Recovery Services Vaults. The key changes include the addition of new parameters and outputs, updates to the porter.yaml file for handling backups, and the creation of new Terraform resources for managing backups.

Backup and Recovery Enhancements:

Parameter and Schema Updates:

Terraform Configuration:

Role Assignments:

These changes collectively enhance the robustness of the workspace by adding comprehensive backup and recovery functionalities, ensuring that critical data can be protected and restored as needed.

What is being addressed

Added in a boolen for enable_backup that is set in the workspace config window. The system will deploy a recovery vault and the needed policy's.
It passes the names of the polices back out so they can be used by other services (sql, vm, etc).

How is this addressed

  • Added terrafrom code to deploy the needed objects. Changed some of the settings for deploying shared vm storage.
  • still needs a management ui for recovery, ability to set the retention and backup policy. And possibly give the ability to move the backups to a diffrent storage account for archive.
  • Workspace documentation needs to be updated. NOT COMPLETED YET!
  • base workspace have been bumped up. This needs to be fixed as it was changed for my testing.

Adding Backup vault to the base workspace.
Allows enableing or disableing of the vault. Also added a step to purge the vault as apart of the clean up and removal.
removed the depends on to the airlock as that maynot be enabled.
@github-actions github-actions bot added the external PR from an external contributor label Feb 17, 2025
removed the random new line that was added
Copy link
Member

@marrobi marrobi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR! Couple of initial comments/questions.

@james-annages james-annages deleted the Add-backups branch February 18, 2025 15:42
@marrobi
Copy link
Member

marrobi commented Feb 18, 2025

@james-annages is there a reason that you closed the PR?

@james-annages
Copy link
Author

james-annages commented Feb 18, 2025

@marrobi Sorry for closeing the pull request, i messed up my commits and ended up with a 3 way split, that was not resolving corretly, i have fixed the merge and can repopen.

@marrobi
Copy link
Member

marrobi commented Feb 18, 2025

@marrobi Sorry for closeing the pull request, i messed up my commits and ended up with a 3 way split, that was not resolving corretly, i have fixed the merge and can repopen.

Be great if can reopen, as then the comments are preserved.

@marrobi
Copy link
Member

marrobi commented Feb 18, 2025

@james-annages let me know when you are sorted and will have another run through. Thanks.

@james-annages james-annages restored the Add-backups branch February 18, 2025 16:46
@james-annages james-annages reopened this Feb 18, 2025
Copy link
Member

@marrobi marrobi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Think getting closer, if can make the changes, and update the CHANGELOG.md file I will give it a test run.

Thank you!


}

resource "azurerm_backup_policy_vm" "vm_policy" {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm thinking , should the backup policy live with the workspace service, so for VMs, guacamole workspace service? Then each use resource is going to need to be protected?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

humm possibly a good call. I was thinking of doing it this way they they are there allready.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe, then the frequency etc coudl be configured across the worksapce. Hmm.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My worry is people think by ticking this box, everthing is being backed up. Many need a note to say "provided supported by the workspace services". The docs need to be clear.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aggreed. Would it be a possiblity to show a icon on ether user objets or workspace servise that show what is backed up. Also can add a tickbox per service that can enable/disable.
My hope was to add a tab that would show some details about backups/what is backed up. For now have it as show only.

@james-annages
Copy link
Author

found a fun bug when trying to clean up a workspace. It soft deletes the Azure storage container befor it deletes the backups so that failes.
protecteditems.ProtectedItemsClient#Delete: Failure responding to request: StatusCode=400 -- Original Error: autorest/azure: Service returned an error. Status=400 Code="BMSUserErrorContainerNotUndeleted" Message="Container is not yet undeleted. Please undelete the container that contains this datasource before attempting to undelete the datasource."
Might need to purge a diffrent way? Any idea's?

@marrobi
Copy link
Member

marrobi commented Feb 19, 2025

found a fun bug when trying to clean up a workspace. It soft deletes the Azure storage container befor it deletes the backups so that failes. protecteditems.ProtectedItemsClient#Delete: Failure responding to request: StatusCode=400 -- Original Error: autorest/azure: Service returned an error. Status=400 Code="BMSUserErrorContainerNotUndeleted" Message="Container is not yet undeleted. Please undelete the container that contains this datasource before attempting to undelete the datasource." Might need to purge a diffrent way? Any idea's?

A depends_on might work...

…rkspaces. may need to look at the script approch again.
@marrobi
Copy link
Member

marrobi commented Feb 20, 2025

@james-annages I'm going to give this a test today. Let me know if feel I shouldn't, will try look at the delete container issues.

@james-annages
Copy link
Author

@james-annages I'm going to give this a test today. Let me know if feel I shouldn't, will try look at the delete container issues.

@marrobi Give it a go, i think im allready using the the depends_on tags? but may see something i have missed.

@marrobi
Copy link
Member

marrobi commented Feb 20, 2025

@james-annages made a few changes, have put them in my branch here - f425e34

Can you pull them into yours?

Still testing delete, its getting further though. Thanks!

@marrobi
Copy link
Member

marrobi commented Feb 20, 2025

Ok, now in the place where as soft_delete_enabled = true is on the vault cannot be deleted until after the retention period So options are turn soft_delete_enabled off, or leave the recovery vault when delete the workspace.

@james-annages
Copy link
Author

Ok, now in the place where as soft_delete_enabled = true is on the vault cannot be deleted until after the retention period So options are turn soft_delete_enabled off, or leave the recovery vault when delete the workspace.

OK my script may be of help then.
If we leave the vault it will need the storage account and the rg to stay as well.
If we keep soft delete on it can protect from accidental deletion. I also think that it get readable on changes/additions to the vault (need to check the azure docs later).
It may be better to disable softdelete and see what happens when we backup a storage share then try and remove the workspace with the vault.

A possible middle ground could be a few commands in the uninstall step:

  • az login
  • turn off the soff delete setting
  • ???possible purge all backups???
  • terraform destroy.

@marrobi
Copy link
Member

marrobi commented Feb 20, 2025

I'm not sure you can disable soft delete.

If it's turned off, the delete seems to work, but i my test is hanging on private endpoint deleting, not sure if it's related.

I think if it works, we disable soft delete, get the PR in, and can start another discussion on if people think that should be changed.

@marrobi
Copy link
Member

marrobi commented Feb 20, 2025

If we were to leave the storage accounts we would start to hit quotas, especially if people have workspaces coming and going.

Can you test with the changes in my branch, with soft delete disabled.

It would actually be much easier with a vault in the core, btu then we have the multi subscription challenge. And it is quite tidy having a vault per workspace.

@james-annages
Copy link
Author

If we were to leave the storage accounts we would start to hit quotas, especially if people have workspaces coming and going.

Can you test with the changes in my branch, with soft delete disabled.

It would actually be much easier with a vault in the core, btu then we have the multi subscription challenge. And it is quite tidy having a vault per workspace.

In our TRE we have a vault in the core and are haveing problems with storage accounts with diffrent CMK on them to what the Vault has. (may be a isolated problem.) We can do storage account across diffrtent subs but not the vms or SQL.
possibly use a core vault for the shares? but then its how to you manage the backups and policys...
Not sure on this one.

@marrobi
Copy link
Member

marrobi commented Feb 20, 2025

If disable soft delete, can you get the workspace to deploy, do a backup, disable and delete?

I think that works for now.

I think you can change soft delete as long as it's not AlwaysOn.

Maybe as a next step we have it on, and turn it off as part of the workspace disable process.

@james-annages
Copy link
Author

@marrobi Just tested the changes from your branch and they look to have worked.
I think it might be worth looking at switching it off as part of the uninstall.
As we have needed the soft delete on the storage accounts when updates have redeployed the share.

@marrobi
Copy link
Member

marrobi commented Feb 20, 2025

Hmm, when have you seen a share get redeployed by an update?

@james-annages
Copy link
Author

@marrobi had it happen when we moved from v0.19.1 to v0.20.0 it deleted the shares then redeployed them.

@marrobi
Copy link
Member

marrobi commented Feb 20, 2025

😕

That's not good. So you upgraded the workspace and it deleted the shares? Any idea from what version to what?

Would like to track that down.

@marrobi
Copy link
Member

marrobi commented Feb 20, 2025

Did you turn CMK on maybe?

Can you create an issue for that, as we need to make sure it doesn't happen to others

@james-annages
Copy link
Author

I will do when I get in to the office tomorrow.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
external PR from an external contributor
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Feature Request: Automated Backup Management for Shared VM Storage in Azure TRE
2 participants