Hello @Jan Kowalik - My apologies for the late reply, and I hope my answer below will still be helpful to you as well as others who visit this thread with similar queries. Understanding the terms below is instrumental in configuring your Function App correctly for the type of workload that will run on it and in setting up an effective scaling strategy. When considering adding more CPU and Memory via either vertical or horizontal scaling, you first want to ensure that the most optimal configuration is already in place at the process level for your Function App, so that the existing CPU/Memory is used as fully & efficiently as possible. At a high level, there are 3 distinct areas of context that layer on top of each other:
- Program vs Process vs Instance vs Threads in the context of an Operating System
- This topic is fundamental to how any computer program runs on any Operating System and is not specific to the Azure Functions runtime. I'm afraid my answer in a Q&A forum reply simply won't do it the justice it deserves, as there is an abundance of content on the internet that explains these concepts much better, and it's up to you how broadly and deeply you want to research this area. That said, I find the following blog post an easy read on these terms: What’s the Diff: Programs, Processes, and Threads
When reading & researching, keep in mind that our focus is on exploring the different ways a program can handle multitasking by (a) breaking down work via different Threads within a Process, (b) creating additional Instances of Processes, and (c) how the number of CPU Cores can bring true Parallelism to running multiple Processes (a short sketch below illustrates the thread-vs-process difference).
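To make the Thread-vs-Process distinction concrete, here is a minimal, hypothetical Python sketch (plain Python, not Azure Functions specific) that runs the same CPU-bound work once on a thread pool and once on a process pool. On a multi-core machine the process pool can use several cores at once, while the thread pool cannot, because of CPython's GIL:

```python
# Minimal sketch: the same CPU-bound task run on threads vs processes.
# Threads share one Python process (and its GIL), so CPU-bound work barely
# speeds up; separate processes can run on separate CPU cores in parallel.
import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor

def busy(n: int) -> int:
    # Purely CPU-bound work (no I/O).
    total = 0
    for i in range(n):
        total += i * i
    return total

def timed(executor_cls) -> float:
    start = time.perf_counter()
    with executor_cls(max_workers=4) as pool:
        list(pool.map(busy, [5_000_000] * 4))
    return time.perf_counter() - start

if __name__ == "__main__":  # guard required for process pools on some platforms
    print("thread pool:  ", timed(ThreadPoolExecutor))
    print("process pool: ", timed(ProcessPoolExecutor))
```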
- Program vs Process vs Instance vs Threads in the context of Python language runtime
- After becoming comfortable with the above, we'll naturally ask ourselves how programs built in different languages vary in the ways they consume CPU/Memory and handle multitasking; there are many other aspects to compare & consider, but let's stay on the matter of Concurrency and Parallelism of the Python runtime, which is part of the premise of your question. CPython (the standard Python runtime) uses a Global Interpreter Lock (GIL), which means that within a single Process only one Thread executes Python bytecode at a time; this is why Python is often described as effectively single-threaded, even though a Python Process can spawn many Threads. In contrast, in multi-threaded language runtimes like the dotnet runtime, work can be broken down and shared by many different Threads truly in parallel. In practice, Python still offers several options for concurrency and parallelism, e.g. ThreadPoolExecutor for overlapping I/O-bound work within one Process, or multiple Processes for CPU-bound work (a short sketch follows below). The following article discusses these options: Python concurrency and parallelism explained
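As a quick, hypothetical illustration of the ThreadPoolExecutor option mentioned above (again plain Python, not Azure Functions specific): threads let I/O-bound calls overlap, even though the GIL prevents them from executing Python bytecode on multiple cores at once.

```python
# Minimal sketch: overlapping I/O-bound calls with a thread pool.
# While one thread waits on the network, the others make progress, so all
# three requests complete in roughly the time of the slowest one.
import urllib.request
from concurrent.futures import ThreadPoolExecutor

URLS = ["https://example.com", "https://example.org", "https://example.net"]

def fetch(url: str) -> int:
    with urllib.request.urlopen(url, timeout=10) as resp:
        return len(resp.read())

with ThreadPoolExecutor(max_workers=3) as pool:
    for url, size in zip(URLS, pool.map(fetch, URLS)):
        print(f"{url}: {size} bytes")
```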
- Host & Worker Processes in Azure Functions
- Architecturally, Azure Functions runs 2 kinds of processes: the Host process and the language Worker process(es). This separation is what enables Azure Functions to support many different languages at scale. When a successfully deployed Python Function App is running, you'll see one process for the Host and, separately, one or more Python worker processes, depending on the workload and how busy your app is (a tiny diagnostic sketch follows the resource list below). When the docs talk about worker processes and instances, they are referring to language runtime processes, irrespective of whether you use regular stateless Azure Functions or stateful Durable Functions. For more about the architectural choices of Azure Functions, be sure to read the following resources:
- Language Extensibility model & gRPC, protobuf
- WebJobs SDK which the Azure Functions Host runs on.
- Differences between isolated worker model and in-process model .NET Azure Functions
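If you'd like to observe the Host/Worker split yourself, a small, hypothetical diagnostic is to log the process id from inside a function body; with more than one worker process configured you should see different PIDs across invocations, all distinct from the Host process that dispatches work to the workers over gRPC.

```python
# Minimal diagnostic sketch: log which Python worker process handled an invocation.
import logging
import os

def log_worker_pid() -> None:
    # Call this from inside any function body. The PID identifies the Python
    # worker process, which is separate from the Functions Host process.
    logging.info("Handled by Python worker PID=%s", os.getpid())
```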
If all 3 areas above are clear and you're comfortable with their concepts, the settings below are the knobs that let you control concurrency, parallelism, processes, and threads:
- PYTHON_THREADPOOL_THREAD_COUNT
- Because the Python worker effectively executes Python code on one thread at a time, the runtime relies on a ThreadPoolExecutor to run multiple invocations concurrently. That is where the PYTHON_THREADPOOL_THREAD_COUNT setting comes into the picture for your Python Function App: it lets you specify the number of threads per Python worker process (see the sketch below).
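As a rough sketch of why this matters (Python v2 programming model; the trigger and route name are just illustrative): a function declared with def (rather than async def) runs each invocation on a thread from the worker's pool, so PYTHON_THREADPOOL_THREAD_COUNT directly controls how many such invocations one worker process can serve at the same time.

```python
# Minimal sketch (Python v2 programming model; route name is illustrative).
# Because this function is declared with `def` (not `async def`), the worker
# runs each invocation on a thread from its thread pool; the app setting
# PYTHON_THREADPOOL_THREAD_COUNT controls how many such threads each Python
# worker process has.
import urllib.request

import azure.functions as func

app = func.FunctionApp()

@app.route(route="fetch")
def fetch(req: func.HttpRequest) -> func.HttpResponse:
    # Blocking I/O: while this thread waits, other pool threads can serve
    # other invocations on the same worker process.
    with urllib.request.urlopen("https://example.com", timeout=10) as resp:
        body = resp.read()
    return func.HttpResponse(f"Fetched {len(body)} bytes")
```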
- maxConcurrentActivityFunctions
- This is specific to Durable Functions and the way it relies on a task hub (a set of queues, tables, and blobs in your storage account) to keep track of the current state & history of the workload. If you're processing a large amount of work via Activity Functions, the work is queued in the task hub; this setting caps how many Activity Functions a single Worker instance processes concurrently, and as the backlog grows, the scale controller allocates more instances running more Worker processes to speed up the processing. In that sense, you can use this setting to guide the scale controller on when to add more Workers.
- maxConcurrentOrchestratorFunctions
- This is the same as the above but applies to Orchestrator Functions (both settings are shown in the host.json sketch below).
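For reference, both Durable Functions settings live under extensions.durableTask in host.json; the values below are purely illustrative and should be tuned to your workload and instance size:

```json
{
  "version": "2.0",
  "extensions": {
    "durableTask": {
      "maxConcurrentActivityFunctions": 4,
      "maxConcurrentOrchestratorFunctions": 4
    }
  }
}
```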
- FUNCTIONS_WORKER_PROCESS_COUNT
- This is not specific to Durable Functions; it applies to normal stateless Functions as well. Being mindful of the Host and Worker architecture, this setting lets you specify anywhere between 1 and 10 Worker processes per Host process (see the sketch below).
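As a sketch, FUNCTIONS_WORKER_PROCESS_COUNT is just a regular application setting; one way to set it is with the Azure CLI (the app name and resource group below are placeholders):

```bash
# Ask the Host to spawn 4 Python worker processes (valid range: 1-10).
az functionapp config appsettings set \
  --name <your-function-app> \
  --resource-group <your-resource-group> \
  --settings FUNCTIONS_WORKER_PROCESS_COUNT=4
```

With that in place, the PID-logging diagnostic shown earlier should report up to 4 distinct worker PIDs as invocations are spread across the worker processes.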
Lastly, in addition to the above, I recommend reading the following 2 docs:
- Improve throughput performance of Python apps in Azure Functions
- Performance and scale in Durable Functions
I hope my explanation makes sense and is helpful to your learning journey. Feel free to ask away in the comments below if you have any follow-up questions.
Please "Accept Answer" if the answer is helpful so that others in the community may benefit from your experience.