Scale Complex Processes With AWS Step Functions
The Pain
Implementing digital processes can be very complex. The complexity lies partly in the implementation itself and partly in operation and scalability.
In the implementation, complexity grows with the number of process steps and especially with the number of branches and error-handling paths. Built as a monolith, the process quickly becomes confusing, and the growing number of branches makes it unclear whether all possible states and error cases are handled adequately.
In operation, scalability and susceptibility to errors can be particularly problematic. Depending on the use case, a process can run for several hours, days or months. It is particularly important that execution continues without errors, even when servers or containers are maintained, added or removed. In terms of scalability, the process should grow with the business. For example, a billing process should be able to handle rapidly growing volumes of user or consumption data.
AWS Step Functions
AWS Step Functions is a visual workflow service used to orchestrate services and automate processes. It lets you create state machines that consist of tasks, branches, parallel executions, error handling, and more. State machines are particularly well suited to implementing processes because the individual states and the transitions between them are defined explicitly and can be visualized.
Step Functions helps to break processes down into small, interconnected building blocks. For example, tasks can be invocations of serverless AWS Lambda functions that read consumption data, calculate prices, generate invoices, etc. Transitions connect the tasks to each other. By default, the output of one task is the input of the next (see Input and Output Processing). Furthermore, branches, pauses, parallel executions and loops can be implemented to represent arbitrarily complex processes.
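The chaining of tasks described above can be sketched as a state machine definition in the Amazon States Language, built here as a Python dictionary and serialized to JSON. This is a minimal illustration, not a complete billing workflow: the state names, the region, the account ID and the Lambda function names are assumptions made up for the example.

```python
import json

# Hypothetical ARN prefix; replace with the ARNs of your own Lambda functions.
LAMBDA_ARN_PREFIX = "arn:aws:lambda:eu-central-1:123456789012:function:"

definition = {
    "Comment": "Linear chain of three tasks (illustrative only)",
    "StartAt": "ReadConsumptionData",
    "States": {
        "ReadConsumptionData": {
            "Type": "Task",
            "Resource": LAMBDA_ARN_PREFIX + "read-consumption-data",
            # By default, this task's output becomes the next task's input.
            "Next": "CalculatePrices",
        },
        "CalculatePrices": {
            "Type": "Task",
            "Resource": LAMBDA_ARN_PREFIX + "calculate-prices",
            "Next": "GenerateInvoices",
        },
        "GenerateInvoices": {
            "Type": "Task",
            "Resource": LAMBDA_ARN_PREFIX + "generate-invoices",
            "End": True,  # last state of the execution
        },
    },
}

# The JSON string is what would be supplied as the state machine definition.
definition_json = json.dumps(definition, indent=2)
print(definition_json)
```

Each "Next" field is one transition, and the state marked with "End" finishes the execution.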
Step Functions manages the state of every execution at all times. An execution of a Standard workflow can run for up to one year. Very large numbers of executions can run in parallel, within the service quotas of the account, without any capacity planning on your side. Executions can be visually inspected via the AWS Management Console. This is especially helpful for debugging, as errors and the respective inputs and outputs of each step can be viewed directly.
Step Functions also provides many built-in features that are often needed when implementing processes. These include automatic retries and error handling.
Advantages of AWS Step Functions
Key benefits of AWS Step Functions include high scalability combined with the pay-as-you-go pricing model. State machines scale with growing requirements without additional effort. Since there is no base cost associated with the service, using AWS Step Functions is especially worthwhile for processes that are launched only occasionally, such as monthly billing.
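To get a feeling for what pay-as-you-go means for an occasional process, the cost of one run can be estimated from the number of state transitions. The following is a rough sketch under stated assumptions: the price per transition is a placeholder that must be checked against the current AWS pricing page for your region, the transition counts are invented for the example, and costs of the invoked services (such as Lambda) are not included.

```python
from decimal import Decimal

# Assumption: price in USD per state transition of a Standard workflow.
# Verify against the current AWS pricing page before relying on it.
ASSUMED_PRICE_PER_TRANSITION = Decimal("0.000025")


def estimate_run_cost(customers, transitions_per_customer,
                      fixed_transitions=2,
                      price=ASSUMED_PRICE_PER_TRANSITION):
    """Estimate the Step Functions cost of one billing run.

    fixed_transitions covers states that run once per execution
    (for example, loading the customers); the rest scales per customer.
    """
    transitions = fixed_transitions + customers * transitions_per_customer
    return transitions * price


# Hypothetical run: 1,000 customers, four transitions per customer.
print(estimate_run_cost(1000, 4))
```

Under these assumptions, a monthly run for 1,000 customers comes to roughly 10 cents, and a month without any run costs nothing.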
Breaking processes into individual process steps reduces complexity and increases the maintainability of the source code. Developers can implement encapsulated process steps independently of each other. The visualization of the state machine also makes all steps and transitions visible to the business department. In addition, the state machine definition is validated during deployment. Incorrect transitions or unreachable tasks are thus identified at an early stage.
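The kind of check mentioned above can be illustrated with a few lines of Python. This is a simplified sketch of the idea and not the validation that the service itself performs: the function validate is hypothetical, it only looks at top-level states, and it ignores states nested inside Map or Parallel states.

```python
from collections import deque


def transition_targets(state):
    """Collect every state name that the given state can transition to."""
    targets = []
    if "Next" in state:
        targets.append(state["Next"])
    if "Default" in state:  # fallback branch of a Choice state
        targets.append(state["Default"])
    for rule in state.get("Choices", []):
        targets.append(rule["Next"])
    for catcher in state.get("Catch", []):
        targets.append(catcher["Next"])
    return targets


def validate(definition):
    """Return a list of problems: unknown transition targets and unreachable states."""
    states = definition["States"]
    start = definition["StartAt"]
    if start not in states:
        return [f"StartAt points to unknown state '{start}'"]

    problems = []
    for name, state in states.items():
        for target in transition_targets(state):
            if target not in states:
                problems.append(
                    f"State '{name}' transitions to unknown state '{target}'")

    # Breadth-first search from the start state to find reachable states.
    reachable = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for target in transition_targets(states[current]):
            if target in states and target not in reachable:
                reachable.add(target)
                queue.append(target)

    for name in states:
        if name not in reachable:
            problems.append(f"State '{name}' is unreachable")
    return problems
```

Run against a definition in which a state points to a missing successor or is never referenced, the function returns one message per problem, and an empty list means no problem of these two kinds was found.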
If you use serverless services such as AWS Lambda, you also benefit from the long maximum execution time of the processes. While a Lambda function times out after 15 minutes at the latest, a Standard workflow can run for up to a year.
The visual representation in the AWS Console also makes all transitions and possible states easy to see, even for non-technical business users. Past executions can be visually inspected and tracked. This makes it much easier for DevOps teams to identify errors in processes.
Step Functions also provides powerful error-handling capabilities. For example, failed tasks can be retried automatically by adding a few parameters to the task. The maximum number of attempts and the waiting time between attempts, which grows by a configurable backoff rate, are set in the task definition.
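As a sketch of how such a retry configuration might look, the task below uses the Retry and Catch fields of the Amazon States Language. The state names, the Lambda ARN and the concrete numbers are assumptions chosen for the example. The helper retry_delays is not part of any AWS library; it only illustrates how the waiting times grow with the backoff rate.

```python
generate_invoice = {
    "Type": "Task",
    # Hypothetical Lambda function ARN.
    "Resource": "arn:aws:lambda:eu-central-1:123456789012:function:generate-invoice",
    "Retry": [
        {
            "ErrorEquals": ["States.TaskFailed"],
            "IntervalSeconds": 5,  # wait before the first retry
            "MaxAttempts": 3,      # maximum number of retries
            "BackoffRate": 2.0,    # multiplier applied to the wait after each retry
        }
    ],
    "Catch": [
        # If the retries are exhausted, continue with a fallback state.
        {"ErrorEquals": ["States.ALL"], "Next": "NotifySupport"}
    ],
    "Next": "SendInvoice",
}


def retry_delays(retrier):
    """Return the waiting time in seconds before each retry attempt."""
    interval = retrier.get("IntervalSeconds", 1)
    rate = retrier.get("BackoffRate", 2.0)
    attempts = retrier.get("MaxAttempts", 3)
    return [interval * rate ** attempt for attempt in range(attempts)]


print(retry_delays(generate_invoice["Retry"][0]))
```

With the values above, the task is retried after 5, 10 and 20 seconds before the Catch rule routes the execution to the hypothetical NotifySupport state.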
External worker steps can be embedded in state machines via activities. This can be used, for example, to implement manual approval processes. A worker program polls for executions in which an approval is pending and informs the responsible department. The result of the step can then be reported back to the execution at a later time using a task token, for example when a person approves the continuation the next day via a web interface.
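The approval flow described above could be sketched as follows. The functions expect an object that behaves like the boto3 Step Functions client and use its get_activity_task, send_task_success and send_task_failure operations. Everything else is an assumption for illustration: the function names, the worker name, the error name and the shape of the result. Storing the token until the person decides, and handling task timeouts, are left out.

```python
import json


def fetch_pending_release(sfn, activity_arn, worker_name="release-worker"):
    """Poll once for an activity task that is waiting for approval.

    `sfn` is expected to behave like boto3.client("stepfunctions").
    Returns None if the poll ends without a task, otherwise the task token
    and the task input. The token must be stored until someone decides.
    """
    task = sfn.get_activity_task(activityArn=activity_arn, workerName=worker_name)
    token = task.get("taskToken")
    if not token:
        return None
    return {"task_token": token, "payload": json.loads(task["input"])}


def approve_release(sfn, task_token, approver):
    """Report the approval; the waiting execution then continues."""
    result = {"approved": True, "approver": approver}
    sfn.send_task_success(taskToken=task_token, output=json.dumps(result))


def reject_release(sfn, task_token, reason):
    """Report a rejection; the task fails with a hypothetical error name."""
    sfn.send_task_failure(taskToken=task_token, error="ReleaseRejected", cause=reason)
```

A worker would call fetch_pending_release in a loop and notify the department, while the web interface would later call approve_release or reject_release with the stored token.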
Use Cases
An everyday example of the use of AWS Step Functions is an order placed through an online store. Figure 1 presents a minimal version of this process. First, a function checks whether the desired items are in stock. If the goods are available, the order is created or updated, an invoice is generated, the invoice amount is collected, and finally the confirmation is sent. If the goods are not available, the customer is informed.
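The order process could be expressed with a Choice state roughly as shown below. This is a sketch under assumptions: the state names, the Lambda ARNs and the inStock field are invented for the example and are not taken from Figure 1. The trace function is a toy helper that follows the definition locally; it only understands the single comparison used here and does not call AWS.

```python
# Hypothetical ARN prefix for the Lambda functions behind each task.
ARN = "arn:aws:lambda:eu-central-1:123456789012:function:"

order_process = {
    "StartAt": "CheckStock",
    "States": {
        "CheckStock": {"Type": "Task", "Resource": ARN + "check-stock",
                       "Next": "InStock?"},
        "InStock?": {
            "Type": "Choice",
            "Choices": [
                {"Variable": "$.inStock", "BooleanEquals": True,
                 "Next": "CreateOrder"}
            ],
            "Default": "InformCustomer",  # goods not available
        },
        "CreateOrder": {"Type": "Task", "Resource": ARN + "create-order",
                        "Next": "GenerateInvoice"},
        "GenerateInvoice": {"Type": "Task", "Resource": ARN + "generate-invoice",
                            "Next": "CollectPayment"},
        "CollectPayment": {"Type": "Task", "Resource": ARN + "collect-payment",
                           "Next": "SendConfirmation"},
        "SendConfirmation": {"Type": "Task", "Resource": ARN + "send-confirmation",
                             "End": True},
        "InformCustomer": {"Type": "Task", "Resource": ARN + "inform-customer",
                           "End": True},
    },
}


def trace(definition, data):
    """Toy walk: list the states visited for the given input data."""
    name = definition["StartAt"]
    visited = []
    while True:
        visited.append(name)
        state = definition["States"][name]
        if state["Type"] == "Choice":
            name = state["Default"]
            for rule in state["Choices"]:
                field = rule["Variable"][2:]  # strip the leading "$."
                if data.get(field) == rule["BooleanEquals"]:
                    name = rule["Next"]
                    break
        elif state.get("End"):
            return visited
        else:
            name = state["Next"]


print(trace(order_process, {"inStock": True}))
print(trace(order_process, {"inStock": False}))
```

In a real execution, the inStock value would come from the output of the CheckStock task rather than from the caller.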
Another example is the monthly customer billing of SaaS companies. In this example, the invoices are created at the beginning of the month, so the state machine is started automatically at that time. In the first step, the customers that need to be billed are loaded. The steps required for each customer are then processed in parallel via a Map state, for which the maximum number of concurrent iterations can be set. For each customer, the price is calculated, and the invoice is generated and sent.
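A possible shape of this billing state machine is sketched below, using a Map state with the MaxConcurrency field. The state names, the Lambda ARNs, the customers field and the concurrency value of 10 are assumptions for illustration. The monthly trigger is not part of the definition and would be configured separately, for example with a scheduler.

```python
import json

# Hypothetical ARN prefix for the Lambda functions behind each task.
ARN = "arn:aws:lambda:eu-central-1:123456789012:function:"

monthly_billing = {
    "StartAt": "LoadCustomers",
    "States": {
        "LoadCustomers": {
            "Type": "Task",
            "Resource": ARN + "load-customers",
            # Assumed to return {"customers": [...]}.
            "Next": "BillCustomers",
        },
        "BillCustomers": {
            "Type": "Map",
            "ItemsPath": "$.customers",  # one iteration per customer
            "MaxConcurrency": 10,        # at most 10 customers at a time
            "ItemProcessor": {
                "ProcessorConfig": {"Mode": "INLINE"},
                "StartAt": "CalculatePrice",
                "States": {
                    "CalculatePrice": {"Type": "Task",
                                       "Resource": ARN + "calculate-price",
                                       "Next": "GenerateInvoice"},
                    "GenerateInvoice": {"Type": "Task",
                                        "Resource": ARN + "generate-invoice",
                                        "Next": "SendInvoice"},
                    "SendInvoice": {"Type": "Task",
                                    "Resource": ARN + "send-invoice",
                                    "End": True},
                },
            },
            "End": True,
        },
    },
}

print(json.dumps(monthly_billing, indent=2))
```

Limiting the concurrency is useful when a downstream system, such as a payment provider or a mail service, can only handle a certain number of parallel requests.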
The processes shown have been reduced to a minimum for better understanding. Real processes often require significantly more tasks and branches. Here it helps that a task in AWS Step Functions can itself be a state machine. For example, the Generate Invoice task can be its own process consisting of steps such as Generate Invoice Number, Create Invoice, Generate PDF, File PDF, etc.
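Nesting could look roughly like this: a child state machine contains the invoice steps, and a task in the parent starts it through the Step Functions service integration and waits for its result. The helper linear_chain and all ARNs and names are hypothetical and only serve the example.

```python
# Hypothetical ARN prefix for the Lambda functions behind each step.
LAMBDA_ARN_PREFIX = "arn:aws:lambda:eu-central-1:123456789012:function:"


def linear_chain(step_names):
    """Build a definition that runs the given Lambda-backed steps in order."""
    states = {}
    for index, name in enumerate(step_names):
        state = {"Type": "Task", "Resource": LAMBDA_ARN_PREFIX + name}
        if index + 1 < len(step_names):
            state["Next"] = step_names[index + 1]
        else:
            state["End"] = True
        states[name] = state
    return {"StartAt": step_names[0], "States": states}


# Child state machine: the steps hidden behind "Generate Invoice".
generate_invoice_process = linear_chain(
    ["GenerateInvoiceNumber", "CreateInvoice", "GeneratePdf", "FilePdf"]
)

# Task in the parent state machine that runs the child and waits for it.
generate_invoice_task = {
    "Type": "Task",
    "Resource": "arn:aws:states:::states:startExecution.sync:2",
    "Parameters": {
        # Hypothetical ARN of the deployed child state machine.
        "StateMachineArn": "arn:aws:states:eu-central-1:123456789012:stateMachine:generate-invoice",
        "Input.$": "$",  # pass the parent's current input to the child
    },
    "Next": "SendInvoice",
}
```

The parent stays small and readable, while the child can be developed, tested and visualized on its own.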
Summary
AWS Step Functions is a very helpful service for implementing digital processes. It pays off in many areas, especially when dealing with complex processes.
In development, organizations benefit from Step Functions by reducing complexity and increasing maintainability. The process is cut into small tasks that can be implemented independently. Visualization makes all states and transitions easy to see.
In operation, companies benefit from the high scalability and the pay-as-you-go pricing model. Processes that are executed only occasionally, such as monthly billing, can be operated in a scalable way for a few cents.
How do you implement digital processes in your company and what are the advantages and disadvantages?