
Drupal 8 Batch Processing

by John Doyle, January 2, 2018

As one of the top Drupal firms in the market, we get a lot of questions about Drupal 8 and its broad range of functionality, including Drupal 8 Batch Processing. To start the new year, we thought we would offer our primer on Drupal 8 Batch Processing.

What is a batch job?

A batch job or batch processing is the execution of a series of jobs in a program on a computer without manual intervention (non-interactive). Strictly speaking, it is a processing mode: the execution of a series of programs each on a set or “batch” of inputs, rather than a single input.

In English, this means that a computer program can break a series of tasks into smaller chunks, or pieces, that run without any manual intervention to trigger them.

When would I want to use this?

Drupal 8 Batch Processing jobs are valuable when large amounts of data or long-running processes would otherwise consume a significant amount of memory. An example would be regenerating all URL aliases on your website. The Pathauto module sets up a batch process for this that regenerates 25 aliases at a time, instead of trying to regenerate an entire site's aliases (think 5,000 – 500,000 entities) in a single pass, which could cripple the system.

Why would I want to use this?

Performance and scalability are the biggest reasons to utilize Drupal 8 Batch Processing in your development. Batch jobs allow the processing of large amounts of data without relying on a single process to complete the task from start to finish in one execution. This lets your server resources be used in smaller chunks and freed up after each batch execution finishes.

Here are some questions you can ask when determining if you might need to create a batch process:

  1. Does the action I need to perform against these items have a per-item resource cost?
    • If the action you're performing requires loading or processing each item individually, you should look to batch processing to handle it. If you are performing a simple task, such as a bulk DB query that affects all nodes in your database, it may not be required.
  2. Do I need to perform an action on a large number of entities?
    • If the answer is yes, then you will likely gain significant performance benefits by utilizing batch processing to work through your task.
  3. Is there a finite set of data that I am performing actions on or can the dataset grow?
    • If you are unsure about how big your data set will get, you should strongly consider batch processing. Not planning for this upfront could cause site downtime and lots of headaches later down the road.
  4. Even if your current data set is small, can it expand?
    • For example, maybe your site only has 30 nodes at the moment, but that number will increase in the future. If this is the case, or you are building a module that you may want to contribute back to the community, you will likely want to look at batch processing as an option for handling this action.

How do I do this?

Creating a batch process in Drupal 8 is relatively straightforward. Here is what you will need to get started:
demo_batch/

  • src/Controller/DemoBatchController.php
  • demo_batch.info.yml
  • demo_batch.routing.yml
  • demo_batch.mybatch.inc
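
demo_batch.info.yml

The info file registers the module with Drupal. A minimal sketch (the human-readable name, description, and package are placeholders):

```yaml
name: Demo Batch
type: module
description: 'Demonstrates Drupal 8 batch processing.'
package: Custom
core: 8.x
```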

demo_batch.routing.yml

The routing file defines a route, the Controller to be used, and the requirements to use it.
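As a sketch, the routing file might look like this (the route name, path, and permission are assumptions for illustration, not fixed requirements):

```yaml
demo_batch.run:
  path: '/demo-batch/run'
  defaults:
    # Points at the controller method that builds and launches the batch.
    _controller: '\Drupal\demo_batch\Controller\DemoBatchController::content'
    _title: 'Run batch'
  requirements:
    # Restrict the route to administrators; adjust to suit your use case.
    _permission: 'administer site configuration'
```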

DemoBatchController.php

The controller tells Drupal what to do when the route (defined above) is accessed. In this case, we are creating a Batch Controller which will handle the processing of the batch job.
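A hedged sketch of such a controller, using Drupal's Batch API (`batch_set()` and `batch_process()` are core functions; the entity query, operation callback name, and redirect target are illustrative assumptions):

```php
<?php

namespace Drupal\demo_batch\Controller;

use Drupal\Core\Controller\ControllerBase;

class DemoBatchController extends ControllerBase {

  public function content() {
    // Collect the IDs of every node to process. Each ID becomes its own
    // batch operation, so memory is freed between items.
    $nids = \Drupal::entityQuery('node')->execute();

    $operations = [];
    foreach ($nids as $nid) {
      $operations[] = ['demo_batch_process_node', [$nid]];
    }

    $batch = [
      'title' => $this->t('Processing nodes...'),
      'operations' => $operations,
      // Callback invoked once all operations have completed.
      'finished' => 'demo_batch_finished',
      // Tell the Batch API which file holds the callbacks.
      'file' => drupal_get_path('module', 'demo_batch') . '/demo_batch.mybatch.inc',
    ];
    batch_set($batch);

    // Launch the batch and redirect to the front page when it completes.
    return batch_process('<front>');
  }

}
```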

demo_batch.mybatch.inc

This include file provides the callback functions the controller uses to handle execution of the job. In this example, we are running through a migration task.
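As a sketch of the callbacks (this version simply resaves nodes rather than performing the migration described above; the function names match the hypothetical controller example):

```php
<?php

/**
 * @file
 * Batch callbacks for the demo_batch module.
 */

use Drupal\node\Entity\Node;

/**
 * Batch operation callback: process a single node.
 */
function demo_batch_process_node($nid, array &$context) {
  $node = Node::load($nid);
  // Do the actual per-item work here; resaving is just a placeholder.
  $node->save();

  // Track results so the 'finished' callback can report on them, and
  // update the progress message shown to the user.
  $context['results'][] = $nid;
  $context['message'] = t('Processing node @nid', ['@nid' => $nid]);
}

/**
 * Batch 'finished' callback.
 */
function demo_batch_finished($success, array $results, array $operations) {
  $messenger = \Drupal::messenger();
  if ($success) {
    $messenger->addMessage(t('@count nodes processed.', ['@count' => count($results)]));
  }
  else {
    $messenger->addError(t('The batch finished with an error.'));
  }
}
```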

Looking for help with your Drupal 8 development? Contact Us to find out how we can help.

Frequently Asked Questions (FAQ)

What is Drupal batch processing in plain English?

It’s a way to split heavy, repetitive work into bite-sized tasks that run automatically until the job is done. Instead of trying to regenerate 50,000 URL aliases at once, Drupal processes a small chunk, frees resources, and queues the next chunk. You get reliability and responsiveness while long jobs complete behind the scenes.

When should I reach for a batch instead of a one-off script?

Use it anytime the operation scales with the number of entities or has a per-item processing cost: migrations, alias rebuilds, file operations, or recalculations. If your dataset can grow unpredictably, plan for batches from day one. Single-shot operations risk timeouts, memory exhaustion, and unhappy users.

How does batch processing improve performance and stability?

Batches bound memory and execution time per step, allowing PHP and web servers to recover between iterations. They also provide progress feedback and resumability if a request fails. This design plays nicely with shared hosting limits, cron, and admin UX so big jobs don’t bring the site to its knees.

What are the basic pieces I need to implement a batch in D8?

Define a route and controller to kick off the work, and implement callbacks that load items, process each subset, and track results. You’ll typically organize code into a module with routing YAML, a controller class, and an include/service file that handles the processing logic. Keep responsibilities clean and test with small datasets first.

Any practical tips for designing a safe batch job?

Cap the chunk size conservatively: opt for smaller batches and iterate up. Log progress and errors, and make processing idempotent so reruns don't corrupt data. Provide an admin UI with a clear description and a dry-run option if possible. Always try it on a staging copy before touching production content.
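To illustrate the chunk-size tip, here is a sketch of a single batch operation that uses the Batch API's `$context['sandbox']` to work through nodes 25 at a time (the function name and the resave step are hypothetical):

```php
<?php

use Drupal\node\Entity\Node;

/**
 * Batch operation: process all nodes in conservative chunks of 25.
 */
function demo_batch_process_chunk(array &$context) {
  // Initialize the sandbox on the first pass.
  if (empty($context['sandbox'])) {
    $context['sandbox']['progress'] = 0;
    $context['sandbox']['max'] = \Drupal::entityQuery('node')->count()->execute();
  }

  // Small, conservative chunk; iterate up only after testing.
  $limit = 25;
  $nids = \Drupal::entityQuery('node')
    ->range($context['sandbox']['progress'], $limit)
    ->execute();

  foreach (Node::loadMultiple($nids) as $node) {
    // Resaving is idempotent here, so a rerun won't corrupt data.
    $node->save();
    $context['sandbox']['progress']++;
  }

  // Report progress as a fraction; any value below 1 tells the Batch API
  // to call this function again for the next chunk.
  $context['finished'] = $context['sandbox']['max']
    ? $context['sandbox']['progress'] / $context['sandbox']['max']
    : 1;
}
```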

What's a common real-world use case?

Regenerating or re-mapping URL aliases via Pathauto is a classic example. Rather than rewrite thousands of paths in one go, the batch handles, say, 25 at a time, respecting server limits and providing visible progress. The same pattern applies to migrations where you process entities in manageable slices.