Magento 2 Bulk API & Asynchronous Import

Below, we discuss the Magento 2 bulk API and the corresponding asynchronous import processes. After a short introduction to the topic, we shed light on the preconditions and history of the asynchronous Magento 2 bulk API. You will find out what new endpoints are used, how asynchronous responses work, what role bulk APIs play in system processes, what status endpoints are utilized, why Swagger is important, etc. After that, we discuss the performance of bulk APIs in Magento 2 as well as the corresponding endpoints and the related features, including routes, payloads, responses, delete requests, store scopes, and object update/generation. Finally, the article switches to asynchronous import processes and product creation via bulk APIs.

Introduction

Although Magento 2 got tons of significant improvements in comparison to its first version, its new API is one of the most notable enhancements. Not only developers but also other users, including merchants, marketers, and backend administrators, can leverage its benefits. With the new Swagger documentation, we got a REST interface that is more practical and user-friendly than ever before.

The new Magento 2 Bulk API now covers numerous areas, including the leading data transfer flows. Thus, you can leverage the technology to create and update products via asynchronous import. Both simple and complex items support automation. At the same time, the platform provides the ability to add new information for categories and customers. The Magento 2 Bulk API even lets you control the entire checkout process.

Preconditions

Back in 2018, Magento took a huge step forward with its APIs. However, there was a bottleneck that led to severe performance and scalability issues. The system didn’t have an optimized Bulk API for catalog import. As a result, connecting to ERP systems was problematic. The more products a website had and the more frequent the updates were, the more complexity its owners faced.

Creating a scalable Bulk API became the number one challenge. The new mechanism should be powerful enough to handle the platform’s numerous calculations and database operations related to such features as tier and group pricing, multiple stores and currencies, catalog and cart price rules, etc. And don’t forget about an observer and plugin system that forced developers to rely upon default events while adding their custom logic.

A community-driven project was started on GitHub to solve all these and other issues. Its core goals were:

  • API that supports multiple products that are persisted at a time;
  • Products coming into Bulk API are persisted without deadlocks;
  • The Bulk API persistence is optimized for performance and remains backward-compatible with current customizations;
  • A client can send multiple Bulk API requests in parallel;
  • Magento can process multiple Bulk API workers in parallel;
  • Support for multiple entities;
  • Quick HTTP requests;
  • Asynchronous Bulk data processing.

The History of The Asynchronous Magento 2 Bulk API

For almost a decade, the Magento platform lacked a native way to connect a webshop to external ERP systems. Everyone thought that the launch of Magento 2 would solve the problem due to its renewed API coverage. However, several projects revealed the unreadiness of the system to support the challenging needs of its users. Even though the new API unveiled a significant superiority over its previous versions, some problems related to the ERP integration occurred. 

Some external platforms cannot send requests to Magento one-by-one, providing the e-commerce platform with all objects simultaneously. The system has to receive thousands of requests via an API endpoint. As you might have guessed, nothing good happens in such situations.

Comwrap proposed to create a new layer between the requests to the Magento 2 REST API and a RabbitMQ queue. The middleware caught all the requests and forwarded them to the queue. On the next stage, it read the returning messages from RabbitMQ, executing them one by one. 

Another enhancement was to add a log of all actions. As a result, it was possible to backtrack all the errors and spend less time solving them.

However, Comwrap was not the only company that suffered from the API limitations of Magento. Another developer that faced problems like that was Balance Internet. Furthermore, the company proposed a similar solution. Both wanted to contribute to the implementation, making Magento 2 more powerful and user-friendly.

Let’s see what changes were implemented.

New Endpoints

The project’s main goal was to change the way Magento perceives different calls. An Asynchronous API added REST URL routes, enabling the platform to tell which calls are synchronous and which are asynchronous. Besides, the existing Magento Web APIs had to support asynchronous mode automatically. The first algorithm included the following steps:

  1. Asynchronous calls are sent to the message queue. 
  2. A consumer reads them from the queue.
  3. The calls are executed one-by-one. 

However, it was necessary to find a way to distinguish between the two call types. Comwrap developers proposed a new prefix. The idea was to separate asynchronous calls from synchronous ones with async placed before /V1, as in POST /async/V1/products.

As for other API requests, they remained unchanged. The same was about parameter logic and request body, which were transferred similarly to synchronous requests. The image below illustrates the complete workflow:

[Image: workflow of an asynchronous API request]

Asynchronous Responses

Now, Magento creates a bulk operation for every asynchronous request. It is an entity that aggregates multiple operations. Besides, the entity tracks the aggregated status of requests. 

Thus, a bulk for a single asynchronous request consists of a sole operation. In its turn, an asynchronous request results in a generated Bulk UUID, which Magento returns. Developers leverage it to track the status of bulk operations.  
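The flow described above can be sketched in a few lines of Python. This is an in-memory illustration only, not Magento's implementation: `submit_async`, `run_consumer`, and the status strings are hypothetical names, a `deque` stands in for the RabbitMQ queue, and a dictionary stands in for the status storage.

```python
import uuid
from collections import deque

# In-memory stand-ins (assumptions): a deque plays the role of the message
# queue, and a dict plays the role of the operation status storage.
queue = deque()
statuses = {}


def submit_async(route, body):
    """Accept a request, wrap it in a bulk with one operation, return the bulk UUID."""
    bulk_uuid = str(uuid.uuid4())
    statuses[bulk_uuid] = "open"
    queue.append({"bulk_uuid": bulk_uuid, "route": route, "body": body})
    return bulk_uuid


def run_consumer(handler):
    """Read queued calls and execute them one by one, recording the outcome."""
    while queue:
        message = queue.popleft()
        try:
            handler(message["route"], message["body"])
            statuses[message["bulk_uuid"]] = "complete"
        except Exception:
            statuses[message["bulk_uuid"]] = "failed"


bulk_id = submit_async("/async/V1/products", {"product": {"sku": "demo-1"}})
run_consumer(lambda route, body: None)  # a no-op handler for the demo
print(statuses[bulk_id])  # complete
```

The caller gets the UUID back immediately, while the actual work happens later in the consumer. That separation is the whole point of the asynchronous mode.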

Bulk API

Now, let’s explore another use-case for Magento 2 asynchronous APIs. An external platform sends a large number of entities related to the same endpoint. With the old API, it would be necessary to invoke the endpoint multiple times. However, Comwrap developers proposed a more efficient algorithm. They offered a particular type of API endpoints – Bulk API. How did it change the game? 

  1. Bulk API combined multiple entities related to the same type into an array
  2. The array participated in a single API request. 
  3. Next, the endpoint handler split the array into individual entities. 
  4. After that, it sent them in separate messages to the Message Queue.
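The four steps above can be illustrated with a short Python sketch. It is a simplification under stated assumptions: `split_bulk_request` and the message fields (`bulk_uuid`, `operation_id`, `body`) are hypothetical names chosen for the example, not Magento's internal structures.

```python
import uuid


def split_bulk_request(entities):
    """Turn one array of entities into one queue message per entity."""
    bulk_uuid = str(uuid.uuid4())
    messages = [
        {"bulk_uuid": bulk_uuid, "operation_id": index, "body": entity}
        for index, entity in enumerate(entities)
    ]
    return bulk_uuid, messages


# Three hypothetical entities arrive in a single request.
bulk_uuid, messages = split_bulk_request(
    [{"sku": "demo-1"}, {"sku": "demo-2"}, {"sku": "demo-3"}]
)
print(len(messages))  # 3
```

Every message carries the same bulk UUID, so the status of the whole request can later be aggregated from the individual operations.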

Since the core AsynchronousOperations Magento module was incorporated even before the contribution, implementing Bulk API was not challenging at all. However, it was necessary to achieve the correct handling of Web API requests. Developers proposed new routes that carry the async/bulk prefix, such as POST /async/bulk/V1/products.

A list of operation statuses was also added to the response.

The new approach opened new possibilities for POST, PUT, or DELETE requests. From that point, developers were able to execute them like a bulk operation.

URL Parameters

It was also necessary to discover the best way to handle placeholders for parameters inside endpoint URLs. Since requests were converted, parameters like “sku” and “optionId” were no longer required as input. Also, note that for Bulk API requests, the URL path was generated by replacing the colon “:” with the prefix “by”, so that :sku became bySku.

Status Endpoints 

Several calls were added to enable operation status tracking. They also introduced real operation responses so that it became possible to track progress or discover errors. Everything was based on the Bulk UUID that became the primary search parameter.

Now, you need to get the Short Status to find out the operation status. The request looks as follows: GET /V1/bulk/:bulkUuid/status

When it comes to the Detailed Status, this request returns the aggregated operation status together with detailed information about each operation. The corresponding request looks as follows: GET /V1/bulk/:bulkUuid/detailed-status
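A small Python helper shows how a client might build both status URLs from a Bulk UUID. It is only a sketch: the host name and the UUID are placeholders, `bulk_status_url` is a hypothetical function, and we assume the store exposes its REST API under /rest.

```python
from urllib.parse import quote


def bulk_status_url(base_url, bulk_uuid, detailed=False):
    """Build the short or detailed status URL for a given Bulk UUID."""
    suffix = "detailed-status" if detailed else "status"
    return f"{base_url.rstrip('/')}/rest/V1/bulk/{quote(bulk_uuid)}/{suffix}"


# Hypothetical host and UUID, used only to show the resulting URLs.
uuid_value = "00000000-0000-0000-0000-000000000000"
print(bulk_status_url("https://shop.example.com", uuid_value))
print(bulk_status_url("https://shop.example.com", uuid_value, detailed=True))
```

A real client would send an authenticated GET request to these URLs and repeat the call until no operation is left open.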

Swagger

A Swagger schema was another crucial feature that enabled asynchronous operations as they are now. The Magento 2 system offered documentation for the asynchronous endpoints in the form of Swagger UI. As a result, it implemented the ability to outline input and output types.

Results

Of course, the Magento Message Queue and the RabbitMQ integration framework were implemented in Magento Commerce first. At a particular stage, they were transferred to Magento Open Source. After three months of development, the project was completed with four pull requests, covering all the improvements described in this chapter. All the benefits became a part of Magento 2.3 and its next versions. However, it was not the end, since the bulk and asynchronous API still required many more things to do:

  • Redis framework for message statuses;
  • Strictly typed status messages;
  • Various message queue framework improvements, etc.

Check the backlog to see what is already implemented. Check the full article for further information: Comwrap Collaborates with Magento on New Asynchronous Bulk API.

Magento 2 Bulk API Performance

Of course, the implementation of the Magento 2 Bulk API not only enabled new possibilities related to the integration but also increased the platform’s performance. The elapsed time of the Asynchronous API is a bit lower than that of the Sync API since the asynchronous approach saves time during item creation. However, each API request still requires initializing the entire Magento instance. Besides, the difference between synchronous and asynchronous APIs depends on batch size: the gap between Async and Sync is small on big batches but more significant on smaller ones.

As for the time of Bulk API, it is almost constant and depends only slightly on the number of items: queueing each item takes time for writing an operation status into the database and queuing a message in RabbitMQ. As a result, the performance of Bulk API can be several times higher on big batches:

  • 30% decrease in total time in comparison to Sync and 41.5% in comparison to Async for 100 items;
  • Up to 84% (Sync) and 80% (Async) for 1000 items.

However, on small batches that may contain 1 product, both Async and Bulk asynchronous methods show less efficiency than Sync. The latter consumes fewer resources. There are situations when a small batch sent via one of the Asynchronous methods spends some time in the RabbitMQ queue before processing begins.

The performance testing illustrates the following tendency: the bigger the number of items is, the higher the performance improvement you get with Bulk API. Thus, the implementation of Bulk API in Magento 2 was a revolutionary idea that was entirely worth the time and effort spent. You can see the detailed results of all performance tests here: Asynchronous Bulk API Performance Test.

Magento 2 Bulk API Endpoints

Now, let’s see how the official documentation describes the new Magento 2 Bulk API. The new technology differs from other REST endpoints due to the ability to combine multiple calls of the same type into an array. This array is then executed as a single request (regardless of the number of calls). The system uses the endpoint handler to split it into individual entities. Next, it writes them as separate messages to the message queue.

As you can see, the initial idea was implemented without any changes. However, you must install and configure RabbitMQ before using the Bulk API. Otherwise, messages won’t be processed efficiently. After the tool is installed, start the consumer responsible for asynchronous and bulk API messages with the following command: bin/magento queue:consumers:start async.operations.all

Routes

Another thing that didn’t change since the initial implementation of the Bulk API idea is the prefix. You need to add /async/bulk before /V1 of a synchronous endpoint route to call a Bulk endpoint (POST /async/bulk/V1/products).

Endpoint routes with input parameters still require additional changes. You need to replace the colon (:) with by. Also, edit the first letter of the parameter, changing it to uppercase.

For instance, we have the following Synchronous route: PUT /V1/products/:sku/media/:entryId

To create a corresponding Bulk route, we need to add the prefix – async/bulk/ – and edit input parameters – :sku and :entryId. The new parameters will look as follows:

  • bySku
  • byEntryId

Thus, the complete bulk route gets the following appearance: PUT /async/bulk/V1/products/bySku/media/byEntryId
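The conversion rule is mechanical, so it is easy to express in code. The Python sketch below applies the rule described in this section to any synchronous route; `to_bulk_route` is a hypothetical helper written for illustration and is not part of Magento.

```python
import re


def to_bulk_route(sync_route):
    """Convert a synchronous route such as /V1/products/:sku into its bulk form."""
    def rename(match):
        # ":sku" -> "bySku": drop the colon, add "by", uppercase the first letter.
        name = match.group(1)
        return "by" + name[0].upper() + name[1:]

    converted = re.sub(r":([A-Za-z]\w*)", rename, sync_route)
    return "/async/bulk/" + converted.lstrip("/")


print(to_bulk_route("/V1/products/:sku/media/:entryId"))
# /async/bulk/V1/products/bySku/media/byEntryId
```

Routes without input parameters only receive the prefix, so /V1/products turns into /async/bulk/V1/products.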

Payloads

It is also worth mentioning that a bulk request payload contains an array of request bodies.
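In other words, the bodies you would send one by one to the synchronous endpoint are wrapped in a single JSON array. Here is a minimal Python sketch, assuming product bodies with only a few illustrative attributes; `build_bulk_payload` and the sample values are hypothetical.

```python
import json


def build_bulk_payload(request_bodies):
    """Serialize a list of single-request bodies as one JSON array."""
    if not isinstance(request_bodies, list):
        raise TypeError("a bulk payload must be a list of request bodies")
    return json.dumps(request_bodies)


# Hypothetical product bodies; a real request needs every required attribute.
bodies = [
    {"product": {"sku": "demo-1", "name": "Demo 1", "attribute_set_id": 4}},
    {"product": {"sku": "demo-2", "name": "Demo 2", "attribute_set_id": 4}},
]
payload = build_bulk_payload(bodies)
print(len(json.loads(payload)))  # 2
```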

Responses

When it comes to the response, it contains an array that shows whether the call added each request to the message queue successfully or not.
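A client usually wants to know which items did not make it into the queue. The Python sketch below filters them out of a response. Treat the field names (`request_items`, `id`, `status`) and the "accepted" value as assumptions made for this example, and check them against the response your Magento version actually returns.

```python
def rejected_items(response):
    """Return the ids of request items that were not accepted into the queue."""
    return [
        item["id"]
        for item in response.get("request_items", [])
        if item.get("status") != "accepted"
    ]


# Hand-written sample response; the field names are assumptions for illustration.
sample = {
    "bulk_uuid": "00000000-0000-0000-0000-000000000000",
    "request_items": [
        {"id": 0, "status": "accepted"},
        {"id": 1, "status": "rejected"},
    ],
    "errors": True,
}
print(rejected_items(sample))  # [1]
```

Note that an accepted item is only queued. Whether the operation itself succeeds is reported later by the status endpoints.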

DELETE requests

And you can use the following call to delete CMS blocks asynchronously: DELETE /async/bulk/V1/cmsBlock/byBlockId

Store scopes

It is possible to specify a store code in the route of an asynchronous endpoint. As a result, it will operate on a specific store instead of a default one. For instance: POST /<store_code>/async/bulk/V1/products

If you want to perform operations on all existing stores, specify the all store code as shown below: POST /all/async/bulk/V1/products
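Since the store code is just the first segment of the route, adding it can be automated. A tiny Python sketch follows, where `scoped_route` is a hypothetical helper and "de" is an example store code:

```python
def scoped_route(route, store_code=None):
    """Prefix a route with a store code; None leaves the default scope."""
    route = "/" + route.lstrip("/")
    return route if store_code is None else f"/{store_code}{route}"


print(scoped_route("/async/bulk/V1/products", "all"))  # /all/async/bulk/V1/products
print(scoped_route("/async/bulk/V1/products", "de"))   # /de/async/bulk/V1/products
```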

Creating/updating objects

Note that there are several rules to follow when you create or update objects using the Magento 2 Bulk API. If you omit the store code when creating a new product, Magento creates a new object with all values set globally. However, if you omit it when updating a product, the values are updated for the default store only. To update values for all store scopes, use the “all” parameter. Use the “<store_code>” parameter to update values for the defined store only.

For more code examples and other nuances, read this article: Magento 2 Bulk API Endpoints.

Asynchronous Import

Now that we’ve briefly described what the Magento 2 Bulk API is and how to use it, we can proceed to the Magento 2 Asynchronous Import. The idea behind this project was to create Web API support for importing different types of data into Magento. First of all, developers wanted to replace the existing module, so they had to recreate its functionality considering asynchronous opportunities. Thus, the new extension covered all import features that were already represented on the platform.

The developers decided to create a module distributed within the Magento/AsynchronousImport* directories. Another goal was to follow technical guidelines and recommendations of the Service Isolations approach. Unique business logic should be prepared for each module. And, what is even more important from the perspective of our article, renewed import processes should be run asynchronously via the Magento 2 Bulk API following these steps:

  1. A user uploads an import data file (CSV or other formats);
  2. The Asynchronous Import extension receives and validates the file;
  3. Next, the module returns a File UUID to the user;
  4. The user applies custom parameters if applicable using the File UUID;
  5. After that, the module implements the following actions:
    1. parses the file, 
    2. splits it into single messages,
    3. sends the messages to the Asynchronous Bulk API of Magento 2;
  6. After the import is complete, it is possible to request the import status and resubmit failed objects (if there are any).
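To make the sequence more tangible, here is a simplified Python sketch of steps 1 to 5 for a CSV file. It does not reproduce the Asynchronous Import module: `receive_file`, `start_import`, the in-memory `storage` dictionary, and the requirement of a "sku" column are all assumptions made for the illustration.

```python
import csv
import io
import uuid


def receive_file(content, storage):
    """Steps 1-3: validate the upload, store it, and return a File UUID."""
    rows = list(csv.DictReader(io.StringIO(content)))
    if not rows or "sku" not in rows[0]:
        raise ValueError("the file must have a header row with a 'sku' column")
    file_uuid = str(uuid.uuid4())
    storage[file_uuid] = rows
    return file_uuid


def start_import(file_uuid, storage):
    """Step 5: split the parsed file into one message per row."""
    return [{"file_uuid": file_uuid, "row": row} for row in storage[file_uuid]]


storage = {}
file_id = receive_file("sku,name\ndemo-1,Demo 1\ndemo-2,Demo 2\n", storage)
messages = start_import(file_id, storage)
print(len(messages))  # 2
```

In the real flow, each message would then be handed over to the Asynchronous Bulk API instead of being kept in a Python list.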

The development was split into two stages. During the first one, it was planned to implement endpoints to receive files, start processing, and receive statuses. The second phase was about building a UI for the Asynchronous Magento 2 import. Developers planned to create a separate Magento extension that would utilize Bulk API to communicate with the Asynchronous Magento 2 import module. You can find the project here: Async Import Wiki.

Were these goals implemented? The Asynchronous Import is already a part of the Magento 2 core, so the goals were achieved at least partly. There is still room for improvement; however, you can always leverage an alternative solution. Currently, we are working on the implementation of the Message Queuing on RabbitMQ and support for Bulk API import in Improved Import & Export extension. The module already supports all the core Magento 2 entities, multiple file formats, API and Google Sheet data transfers, as well as robust manual or preset-based mapping capabilities. At the same time, even with no support for the Asynchronous Bulk API, the plugin shows high performance even for files with thousands of records. Follow the link below for further information:

Get Improved Import & Export Magento 2 Extension

And don’t miss our Complete Guide to Magento 2 Product Import / Export.

Creating Products via Bulk APIs

You can use the Magento 2 Bulk APIs not only for import/export purposes but also for creating multiple customers and products, updating prices, and assigning numerous products to a specific warehouse in bulk. Due to the Bulk APIs, all these actions are considered a single call.

The official Magento 2 documentation includes a tutorial that explains how to leverage the Magento 2 Bulk APIs. The material teaches how to create a configurable product. This product type is chosen since a configurable product is a parent product of multiple simple products. It results in a situation when a buyer must make at least one choice (usually, there are more choices) to add a product to the cart. 

For example, shoes come in a variety of sizes. If you are offering a model of sneakers in five sizes, then your configurable product consists of five simple products. If there are two colors of each size, then you have to deal with ten simple products (10 different combinations).
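The arithmetic is a plain Cartesian product of the option values, which a few lines of Python confirm. The sizes, colors, and SKU pattern below are made up for the example.

```python
from itertools import product

sizes = ["38", "39", "40", "41", "42"]  # five hypothetical sizes
colors = ["black", "white"]             # two hypothetical colors

# Each (size, color) pair corresponds to one simple product.
variants = [f"sneaker-{size}-{color}" for size, color in product(sizes, colors)]
print(len(variants))  # 10
```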

As for the tutorial, it describes how to create a t-shirt that comes in three sizes but one color. It walks through creating the following:

  1. The configurable product with basic characteristics;
  2. A simple product for each size;
  3. The connection between simple products and the configurable one;
  4. An option that allows specifying a custom text on the shirt.

You can find the tutorial here: Create a configurable product using Bulk APIs.

Magento 2 Asynchronous API Import FAQ

How to implement asynchronous import and export in Magento 2?

With the help of the Improved Import & Export Magento 2 extension, you can leverage asynchronous import & export processes on your e-commerce website. The module lets you import tables with thousands of products right from your Magento 2 admin panel in a couple of minutes. Furthermore, you can rely on the extension to create a schedule of updates so that the plugin transfers data within the specified intervals automatically.

How to import products to Magento 2 via APIs asynchronously?

The Improved Import & Export extension lets you transfer data via APIs asynchronously. Since the module supports all the core entities used in Magento 2, you can import products to Magento 2 via an API connection, moving information from the connected source automatically. Create a REST or SOAP API request to import XML or JSON files with product data to your Magento 2 installation. You only need to specify an API Call URL, set the request options and body, as well as configure several other parameters to import products to Magento 2 via APIs.

How to import customers to Magento 2 via APIs asynchronously?

The procedure is basically the same: the Improved Import & Export extension provides the ability to transfer customer data via APIs. You need to create a REST or SOAP API request to import XML or JSON files with customers to your Magento 2 installation. You only need to specify an API Call URL, set the request options and body, as well as configure several other parameters to import customer data to Magento 2 via APIs.

How to import orders to Magento 2 via APIs asynchronously?

The Improved Import & Export extension is also helpful when it comes to orders. The module provides the ability to transfer them via APIs. You need to create a REST or SOAP API request to import XML or JSON files with order data to your Magento 2 installation. You only need to specify an API Call URL, set the request options and body, as well as configure several other parameters to import orders to Magento 2 via APIs.

How to export products from Magento 2 via APIs asynchronously?

The Improved Import & Export extension lets you both import and export data via APIs asynchronously. The situation around export processes is similar to the one we’ve just described: the module supports all the core entities used in Magento 2, so you can use it to export products from Magento 2 via an API connection. Integrate your e-commerce website with CRM, ERP, PIM, and other systems to export data to them. Freely transfer products from Magento 2 via API connections.

How to export orders from Magento 2 via APIs asynchronously?

You can also use the Improved Import & Export extension to export orders from Magento 2 via API, integrating your storefront with an ERP platform or other systems.

How to export customers from Magento 2 via APIs asynchronously?

You can also create a new export job in the Improved Import & Export extension to export customers from Magento 2 via APIs asynchronously.

What other entities can Improved Import & Export transfer via API?

With the Improved Import & Export Magento 2 extension, you can create API connections to transfer all entities the module supports. In addition to products, orders, and customers, the module lets you use API connections to import and export Categories, Advanced Pricing, Product Attributes, CMS Pages & Blocks, Catalog & Cart Rules, Gift Cards, Reviews, URL Rewrites, Search Terms & Synonyms, Widgets, Page Hierarchy, Newsletter Subscribers, etc.

How to import and export data in Magento 2 via APIs automatically?

Install the Improved Import & Export Magento 2 extension to transfer data in Magento 2 via APIs automatically. The module supports both asynchronous APIs and cron jobs. As a result, you can create a schedule of updates transferring information to/from your Magento 2 installation in a fully automated manner. Just specify an interval to launch data transfers when creating a new import or export job.
