We use the Azure Storage service in most of our cloud projects. It offers a convenient way to store binary objects such as images, videos, and documents, and it is very flexible about redundancy and more.
Azure Storage is very useful in combination with Azure Functions when you need to run a function whenever a file is saved to a blob container. The function is triggered (BlobTrigger), and the state of that file (whether it was already processed) has to be persisted somewhere.
Soooo…
…our blob container had 50k documents that had already been processed by an Azure Function, so the state was known. It is stored inside the azure-webjobs-hosts container in the storage account. If you open it and explore its content, you can see it contains a few folders; the one we are looking for is blobreceipts. There are sub-folders for each container you have, one level further you have the name of the Azure Function, and then the ETag for every blob we stored in that container. When you finally drill down, you get blank files with the same names as the files stored in the actual container. Below are some screenshots from the Storage Explorer application.
Let’s move to the fun part. We were making business logic updates inside our Azure Functions, and as a result we needed to migrate the blobs from one storage account to another (plus rename the containers). You might think: simple enough, just copy everything over and that’s it. Well, if you forget about the state data (the blobreceipts folder), all of the 50k files will be processed again.
Our functions are idempotent, but reprocessing is still far from optimal: we do not want to run any logic or hit the database and other resources unnecessarily. Also, the business logic for a single file can take up to 10 minutes to run, so this is clearly a no-no.
The solution is very simple. Actually, it is really just one IF away 🙂! We modified all of our functions with an IF clause at the beginning of the Run method to turn each one into a “NOOP” function, like this:
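    public static void Run(
        [BlobTrigger("%Container%/{name}")] byte[] pdfFile,
        string name,
        ILogger log)
    {
        // config is the function's configuration object; its values are strings,
        // so compare against "true" rather than the bool literal.
        if (string.Equals(config["FunctionNoop"], "true", StringComparison.OrdinalIgnoreCase))
        {
            // NOOP mode: skip the work, but let the invocation complete normally.
            log.LogInformation("NOOP pass through.");
            return;
        }

        // Normal function process
    }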
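One detail worth getting right is how the flag is parsed, since configuration values arrive as strings. Below is a minimal sketch of a helper that could back the IF clause. The class name `NoopSwitch` and its methods are hypothetical and not part of our actual functions; the only thing taken from the snippet above is the setting name `FunctionNoop`. It assumes the setting is read from the process environment, which is where Azure Functions exposes application settings.

```csharp
#nullable enable
using System;

public static class NoopSwitch
{
    // Setting name taken from the snippet above.
    public const string SettingName = "FunctionNoop";

    // True only when the value parses as boolean true ("true", "True", " TRUE ").
    // A missing, empty, or unparsable value leaves the function in normal mode.
    public static bool IsEnabled(string? rawValue)
    {
        return bool.TryParse(rawValue, out bool enabled) && enabled;
    }

    // Reads the flag from the environment (assumption: the setting is an app setting).
    public static bool IsEnabledFromEnvironment()
    {
        return IsEnabled(Environment.GetEnvironmentVariable(SettingName));
    }

    public static void Main()
    {
        Console.WriteLine(IsEnabled("true")); // True
        Console.WriteLine(IsEnabled(null));   // False
        Console.WriteLine(IsEnabled("yes"));  // False
    }
}
```

Defaulting to normal processing when the value is missing or malformed means a typo in the setting cannot silently switch a production function into NOOP mode.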
Every function has a configuration setting that tells it to skip processing. In that mode the function executes, writes a log entry, and returns. Because the invocation still completes, the state of the file is regenerated and saved to the azure-webjobs-hosts container structure. For the duration of the migration we turned this setting on and copied the files from one blob container to the other; the BlobTrigger fired, each function skipped processing, and the file states were saved. After the migration was complete, we disabled NOOP mode so that new files would be processed normally.
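Before switching NOOP mode off, it can be reassuring to confirm that every copied blob actually has a receipt. The following is a rough sketch of such a check, not something from our migration scripts: `MigrationCheck` and `BlobsWithoutReceipt` are hypothetical names, and the two listings are assumed to have been fetched beforehand (for example, exported from Storage Explorer). It relies only on the observation that a receipt path ends with the container name followed by the blob name, and ignores whatever folders come before that.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MigrationCheck
{
    // Returns the blobs for which no receipt path ends in "/<container>/<blobName>".
    public static List<string> BlobsWithoutReceipt(
        string container, IEnumerable<string> blobNames, IEnumerable<string> receiptPaths)
    {
        // Group receipts by their last path segment so each blob only scans a few candidates.
        var byFileName = receiptPaths.ToLookup(r => r.Substring(r.LastIndexOf('/') + 1));

        return blobNames.Where(blob =>
        {
            string suffix = "/" + container + "/" + blob;
            string fileName = blob.Substring(blob.LastIndexOf('/') + 1);
            return !byFileName[fileName].Any(r => r.EndsWith(suffix, StringComparison.Ordinal));
        }).ToList();
    }

    public static void Main()
    {
        // Made-up listings for illustration only.
        var blobs = new[] { "a.pdf", "b.pdf" };
        var receipts = new[] { "blobreceipts/host/func/etag1/documents/a.pdf" };

        var missing = BlobsWithoutReceipt("documents", blobs, receipts);
        Console.WriteLine(string.Join(", ", missing)); // b.pdf
    }
}
```

An empty result suggests that every blob in the new container has been seen by the trigger, so NOOP mode can be turned off.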
Check out Mr. Shaddad’s post for more information on how Azure Functions blob triggers work.
Ta-da!