When configuring Azure Blob Storage, you are presented with many settings and options. This article walks through those options and identifies the settings that give the best performance in Popdock.
What is a Data Lake
A data lake is a single, centralized repository where you can store all your data, both structured and unstructured. A data lake enables your organization to quickly and more easily store, access, and analyze a wide variety of data in a single location. With a data lake, you don’t need to conform your data to fit an existing structure. Instead, you can store your data in its raw or native format, usually as files or as binary large objects (blobs).
Microsoft Learn
For Popdock, the advantage of using a Data Lake for large data sets, rather than storing that data in SQL or another application, is speed and flexibility. From a speed perspective, a Data Lake lets us read very large files quickly, without worrying about API limits or memory restrictions on a server. Because all data in a Data Lake is stored in files, there is no schema we need to fit the data into; instead, we can work with any data structure, whether that is a standard ERP table or a heavily customized system.
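To illustrate the flexibility point, here is a minimal sketch of reading files without declaring a schema first. It is not how Popdock itself reads data; the two in-memory CSV texts, their column names, and the `read_rows` helper are hypothetical stand-ins for files that might sit in a Data Lake.

```python
import csv
import io

# Hypothetical file contents: one "standard" table and one customized table
# with extra columns. Neither layout is declared anywhere in advance.
STANDARD_ERP = "CustomerId,Name\n1001,Contoso\n1002,Fabrikam\n"
CUSTOMIZED = "CustomerId,Name,LoyaltyTier,Region\n2001,Adatum,Gold,West\n"


def read_rows(text):
    """Parse CSV text into dictionaries keyed by whatever header the file has."""
    return list(csv.DictReader(io.StringIO(text)))


for label, text in (("standard", STANDARD_ERP), ("customized", CUSTOMIZED)):
    rows = read_rows(text)
    # The column list is discovered from the file itself at read time.
    print(label, list(rows[0].keys()))
```

The same reader handles both files because the structure comes from each file's header rather than from a table definition created ahead of time.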
1. Create a Storage Account
First, we need to create a Storage account in Azure. A Storage account is the environment that will hold our folders (Containers).
- Log into the Azure Portal.
- Click Create a Resource.
- Select the Storage account resource.
- Select an Azure Subscription and Resource Group into which to add the Storage account.
- Provide a unique Storage account name.
- Select a Region to store the account data.
- Set the performance to Standard.
- Note: Popdock currently only supports Standard performance.
- Set the desired Redundancy.
- Note: We recommend Geo-redundant storage (GRS).
- Click Next: Advanced.
- On the Advanced tab, navigate to the Security section.
- Check the box for Allow enabling anonymous access on individual containers.
- Check the box for Enable storage account key access.
- Navigate to the Hierarchical namespace section.
- Check the box for Enable hierarchical namespace.
- Navigate to the Blob storage section.
- Set the Access tier to Hot.
- Click Review.
- Click Create.
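If you prefer to script this step, the portal settings above can be sketched as an Azure CLI command. The Python below only builds and prints the command; it does not run it. The mapping of each portal option to a CLI flag is our reading of `az storage account create` and should be checked against `az storage account create --help` for your CLI version. The account name, resource group, and region in the usage line are placeholders, and the helper function names are hypothetical.

```python
import re
import shlex

# Redundancy options available with Standard performance.
STANDARD_REDUNDANCY = {"LRS", "GRS", "RAGRS", "ZRS", "GZRS", "RAGZRS"}


def validate_account_name(name):
    """Storage account names must be 3-24 characters, lowercase letters and digits only."""
    if not re.fullmatch(r"[a-z0-9]{3,24}", name):
        raise ValueError(f"invalid storage account name: {name!r}")
    return name


def build_account_create_args(name, resource_group, location, redundancy="GRS"):
    """Build an 'az storage account create' argument list mirroring the steps above."""
    if redundancy not in STANDARD_REDUNDANCY:
        raise ValueError(f"unsupported redundancy: {redundancy!r}")
    return [
        "az", "storage", "account", "create",
        "--name", validate_account_name(name),
        "--resource-group", resource_group,
        "--location", location,
        "--kind", "StorageV2",
        "--sku", f"Standard_{redundancy}",          # Standard performance + redundancy
        "--allow-blob-public-access", "true",       # allow anonymous access per container
        "--allow-shared-key-access", "true",        # enable storage account key access
        "--enable-hierarchical-namespace", "true",  # hierarchical namespace
        "--access-tier", "Hot",
    ]


# Placeholder values; replace them with your own before running the printed command.
print(shlex.join(build_account_create_args("popdockdatalake", "my-resource-group", "eastus")))
```

Validating the name up front is useful because the account name must also be globally unique, so a rejected name is easier to diagnose when formatting errors are already ruled out.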
2. Create a Container
A Container is essentially a folder. Each Container can have its own security settings, which lets you limit access to only some of the data in your Storage account.
Azure pricing is based on the amount of data stored, not the number of Containers used. Add as many Containers as needed to organize your data.
- Search the Azure Portal for Storage accounts.
- Select your Storage account.
- From the left-hand navigation select Containers.
- Click + Container to add a new Container.
- Provide a Name for the Container.
- Set the Anonymous access level to Private (no anonymous access).
- Click Create.
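The same step can be sketched for the Azure CLI. As before, the Python below only builds and prints the command. We assume `--public-access off` corresponds to the Private (no anonymous access) level in the portal; confirm this with `az storage container create --help`. The account and container names in the usage line are placeholders, and the helper function names are hypothetical.

```python
import re
import shlex


def validate_container_name(name):
    """Container names: 3-63 characters, lowercase letters, digits, and hyphens;
    must start and end with a letter or digit, with no consecutive hyphens."""
    if not 3 <= len(name) <= 63:
        raise ValueError(f"container name must be 3-63 characters: {name!r}")
    if not re.fullmatch(r"[a-z0-9](?:-?[a-z0-9])*", name):
        raise ValueError(f"invalid container name: {name!r}")
    return name


def build_container_create_args(account_name, container_name):
    """Build an 'az storage container create' argument list for a private container."""
    return [
        "az", "storage", "container", "create",
        "--account-name", account_name,
        "--name", validate_container_name(container_name),
        "--public-access", "off",  # assumed equivalent of Private (no anonymous access)
    ]


# Placeholder values; replace them with your own before running the printed command.
print(shlex.join(build_container_create_args("popdockdatalake", "erp-exports")))
```

Because each Container carries its own access settings, repeating this command with different names is a simple way to separate data sets that need different access.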