Python Azure Function App concurrency and differences between instance, worker and language worker

Jan Kowalik 110 Reputation points
2023-10-10T09:43:51.4433333+00:00

In Python Function Apps, what is the difference between a host, an instance, a worker, a worker process, a language worker and a thread? All are discussed in various related documentation pages, but it is not clear to me whether they refer to the same thing. How do they relate to each other? How does one affect the other?

How do the below settings affect all of the above and each other?

  • maxConcurrentActivityFunctions
  • maxConcurrentOrchestratorFunctions
  • FUNCTIONS_WORKER_PROCESS_COUNT
  • PYTHON_THREADPOOL_THREAD_COUNT

Does it make sense to set threadpool count if using async python functions?

I had written a more detailed question here, but your system has deleted it for no reason really. Apparently violated community rules. If a detailed question is a violation I don't know how to ask questions here.

Azure Functions
An Azure service that provides an event-driven serverless compute platform.

Accepted answer
  1. Mike Urnun 9,866 Reputation points Microsoft Employee
    2023-11-10T20:49:05.4066667+00:00

    Hello @Jan Kowalik - My apologies for the late reply; I hope my answer below will still be helpful to you and to others who visit this thread with similar queries. Understanding the terms below is instrumental in configuring your function app correctly for the type of workload it will run and in setting up an effective scaling strategy. Before adding more CPU and Memory resources via vertical or horizontal scaling, first ensure that the most optimal configuration is already in place at the process level for your Function App, so that the existing CPU/Memory is utilized as fully & efficiently as possible. At a high level, there are 3 distinct areas of context that layer on top of each other:

    • Program vs Process vs Instance vs Threads in the context of an Operating System
      • This topic is fundamental to how any computer program runs on any Operating System & is not specific to the Azure Functions runtime. A Q&A forum reply can't do it justice, and there is an abundance of content on the internet that explains it much better; how broadly and deeply you research this area is up to you. That said, I find the following blog post an easy read on these terms: What’s the Diff: Programs, Processes, and Threads
        When reading & researching, keep in mind that our focus is on the different ways a program can handle multitasking: (a) breaking down work across different Threads within a Process, (b) creating additional Instances of Processes, and (c) using multiple CPU Cores to bring true Parallelism to running multiple Processes.
    • Program vs Process vs Instance vs Threads in the context of Python language runtime
      • Once you're comfortable with the above, the natural next question is how programs written in different languages differ in the way they consume CPU/Memory & handle multitasking; there are many other aspects to compare & consider. On the matter of Concurrency and Parallelism in the Python runtime, which is part of the premise of your question: a Python program runs on a single thread by default, and in standard CPython the Global Interpreter Lock (GIL) lets only one thread execute Python bytecode at a time within a Process. By contrast, in a runtime such as dotnet, work can be broken down & shared across many Threads running in parallel. Even though a normal Python program may run as a single-threaded process, in practice there are several options for running work concurrently, e.g. ThreadPoolExecutor, which helps mainly with I/O-bound work. The following article discusses these options: Python concurrency and parallelism explained
    • Host & Worker Processes in Azure Functions
      • Architecturally, Azure Functions creates 2 kinds of processes: Host and Worker processes. This is what enables Azure Functions to support many different languages at scale. When a successfully deployed Python Function App is running, you'll see a process for the Host and, separately, one or more Python worker processes -- depending on the workload and how busy your app is. When the docs talk about worker processes or language workers, they are referring to these language runtime processes, irrespective of regular stateless Azure Functions or stateful Durable Functions; an instance is a VM that runs a Host process together with its worker process(es). For more about Architectural choices of Azure Functions, be sure to read the following resources:
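
The ThreadPoolExecutor option mentioned above can be sketched as follows. This is a minimal illustration in plain Python, not Azure Functions code: `fake_io_call` and `run_with_threads` are hypothetical names, and `time.sleep` stands in for an I/O-bound call such as an HTTP request. Because a sleeping thread releases the GIL, the four calls overlap when four threads are available.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_io_call(task_id: int) -> int:
    """Hypothetical stand-in for an I/O-bound call (HTTP request, DB query)."""
    time.sleep(0.2)  # a sleeping thread releases the GIL, so other threads can run
    return task_id


def run_with_threads(task_count: int, max_workers: int) -> float:
    """Run task_count fake I/O calls on a pool and return the elapsed seconds."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(fake_io_call, range(task_count)))
    assert results == list(range(task_count))  # map preserves input order
    return time.perf_counter() - start


one_thread = run_with_threads(task_count=4, max_workers=1)
four_threads = run_with_threads(task_count=4, max_workers=4)
print(f"1 thread: {one_thread:.2f}s, 4 threads: {four_threads:.2f}s")
```

With one thread the calls run back to back; with four they overlap. CPU-bound work would not see the same gain from threads, since the GIL serializes Python bytecode; additional processes are the usual tool there.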

    If all 3 areas above are clear and you're comfortable with their concepts, the settings below can be thought of as knobs that let you control concurrency, parallelism, processes, and threads:

    • PYTHON_THREADPOOL_THREAD_COUNT
      • Because a Python worker process runs on a single thread by default, the Python language worker uses a ThreadPoolExecutor to run synchronous functions concurrently. That is where the PYTHON_THREADPOOL_THREAD_COUNT setting comes into the picture for your Python Function App -- it lets you specify the maximum number of threads in that pool, per Python worker process.
    • maxConcurrentActivityFunctions
      • This is specific to Durable Functions, which relies on a task hub in an external storage account to keep track of the current state & history of the workload. If you're processing a large amount of work via Activity Functions, the work is registered in the task hub and the Scale Controller component begins allocating more VMs that run more Worker processes to speed up the processing. The setting itself caps how many Activity Functions can be processed concurrently on a single Host instance, so it bounds how much work each instance takes on before more instances are needed.
    • maxConcurrentOrchestratorFunctions
      • This is the same as the above but applies to the Orchestrator functions.
    • FUNCTIONS_WORKER_PROCESS_COUNT
      • This is not specific to Durable Functions; it also applies to normal stateless Functions. Keeping the Host and Worker architecture in mind, this setting lets you specify the number of Worker processes per Host process, anywhere from 1 to 10.
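
To see how these knobs layer on top of each other, here is a rough back-of-the-envelope sketch. It assumes synchronous functions only and ignores trigger-specific limits, async functions, and scale-out to additional instances; the function names are hypothetical and the numbers are made-up examples, not recommendations.

```python
def max_sync_invocations(worker_process_count: int, threadpool_thread_count: int) -> int:
    """Rough upper bound on sync invocations executing at once on one instance.

    Assumption: each worker process runs sync functions on its own thread pool,
    so the ceiling is processes x threads per process.
    """
    return worker_process_count * threadpool_thread_count


def effective_activity_concurrency(
    max_concurrent_activity_functions: int,
    worker_process_count: int,
    threadpool_thread_count: int,
) -> int:
    """Rough bound for sync Activity Functions actually executing on one instance.

    Assumption: the Host hands out at most max_concurrent_activity_functions,
    and the workers can execute at most processes x threads of them at a time.
    """
    return min(
        max_concurrent_activity_functions,
        max_sync_invocations(worker_process_count, threadpool_thread_count),
    )


# Example values only: FUNCTIONS_WORKER_PROCESS_COUNT=4, PYTHON_THREADPOOL_THREAD_COUNT=8
print(max_sync_invocations(4, 8))                 # 32
print(effective_activity_concurrency(10, 4, 8))   # 10: the Durable setting is the bottleneck
print(effective_activity_concurrency(100, 4, 8))  # 32: processes x threads is the bottleneck
```

The point of the sketch is that raising one setting does little when another is the tighter limit, so the settings are best tuned together.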

    Lastly, in addition to the above, I recommend reading the following 2 docs:

    I hope my explanation makes sense and is helpful to your learning journey. Feel free to ask in the comments below if you have any follow-up questions.


    Please "Accept Answer" if the answer is helpful so that others in the community may benefit from your experience.


1 additional answer

  1. Eric Fan 0 Reputation points
    2024-03-15T06:02:09.33+00:00

    Hi all:

    Very good topic. I have a question about async and threads.
    

    As mentioned in the async section, the Python language worker treats functions and coroutines differently. A coroutine is run within the same event loop that the language worker runs on. On the other hand, a function invocation is run within a ThreadPoolExecutor, which is maintained by the language worker as a thread.

    Does the above mean that async and a larger thread count can't be used together? That is, if I use async, will changing PYTHON_THREADPOOL_THREAD_COUNT have no impact because no extra threads are created?
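
The behavior described in that quote can be sketched with plain asyncio. This is an illustration of the pattern only, not the language worker's actual code: `sync_handler`, `async_handler`, and `dispatch` are hypothetical names, and the pool's `max_workers` merely plays the role that PYTHON_THREADPOOL_THREAD_COUNT plays for the worker.

```python
import asyncio
import threading
from concurrent.futures import ThreadPoolExecutor


def sync_handler() -> str:
    """A regular (sync) function: reports the thread it was executed on."""
    return threading.current_thread().name


async def async_handler() -> str:
    """A coroutine: reports the thread it was executed on."""
    await asyncio.sleep(0)  # yield to the event loop once, as real async code would
    return threading.current_thread().name


async def dispatch(pool: ThreadPoolExecutor) -> tuple:
    loop = asyncio.get_running_loop()
    loop_thread = threading.current_thread().name
    # Coroutines are awaited directly, so they run on the event loop's own thread.
    async_thread = await async_handler()
    # Sync functions are handed to the thread pool so they don't block the loop.
    sync_thread = await loop.run_in_executor(pool, sync_handler)
    return loop_thread, async_thread, sync_thread


with ThreadPoolExecutor(max_workers=2, thread_name_prefix="pool") as pool:
    loop_thread, async_thread, sync_thread = asyncio.run(dispatch(pool))

print(f"event loop: {loop_thread}, coroutine: {async_thread}, sync function: {sync_thread}")
```

In this sketch the coroutine never touches the pool, which suggests that the thread count only matters for sync functions. The two can still coexist in one app: async functions share the event loop, while any sync functions draw on the pool.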
    
