How to Import Data from Spark SQL to Magento 2
Do you know how to import a remote table from
Table of contents
The Common Way of Moving Spark SQL Tables into Magento 2
The common way of moving Spark SQL tables into Magento 2 consists of three steps. Firstly, you should generate an output file from a remote Spark SQL. Note that there are several important requirements to be followed, but we will return to them lately. Secondly, you have to think ahead about import errors caused by different attribute standards: edit the data file replacing third-party designations with the corresponding internal ones. Thirdly, start the update. Let’s take a look at each step individually.
CSV Export from Spark SQL Database
Why CSV, you will ask? Because it is the only supported file format when it comes to the Magento import. You can use the following statement to write the contents of a dataframe in the desired format:
1 |
df.write.csv("/data/home/csv") |
If you need to write the whole dataframe into a single CSV file, use another statement:
1 |
df.coalesce(1).write.csv("/data/home/sample.csv") |
Spark 1.x lets you use
1 2 3 4 5 |
import org.apache.spark.sql.hive.HiveContext // sc - existing spark context val sqlContext = new HiveContext(sc) val df = sqlContext.sql("SELECT * FROM testtable") df.write.format("com.databricks.spark.csv").save("/data/home/csv") |
If you need to write the contents into a single file, use the following code:
1 2 3 4 5 |
import org.apache.spark.sql.hive.HiveContext // sc - existing spark context val sqlContext = new HiveContext(sc) val df = sqlContext.sql("SELECT * FROM testtable") df.coalesce(1).write.format("com.databricks.spark.csv").save("/data/home/sample.csv") |
Map Third-Party Attributes to Magento 2 Attributes
Now, when you have a CSV file with the necessary information, you need to do map attributes. Note that it is a very time-consuming process. The more data your update include, the more designations should be checked and edited. Remember that third-party attributes will cause errors during the import process.
You can also edit attribute values matching external designations to ones used internally. Although they don’t cause issues, you will save tons of time by modifying values in bulk during this stage. Otherwise, you will have to edit values for each product individually via the Magento admin.
Import Spark SQL Table into Magento 2
Finally, you can proceed to the last step of the Magento 2 Spark SQL synchronization and start the import process. Remember that it is enough to use the standard Magento 2 tools to import data from a CSV file to Magento 2. But you will face multiple issues. Since no automation is possible, you have to repeat all the three steps described in this chapter every time a new update is provided. It doesn’t seem that constant manual updates are the most efficient way to spend time, but if you still want to perform the data exchange by yourself, check the following Magento 2 import guide: The Complete Guide to Magento 2 Product Import / Export. It sheds light on various data import details including attributes editing and other nuances of the successful data migration.
Pros & Cons
Let’s see what pros and cons the first approach to data import from Spark SQL to Magento 2 offers.
Pros
- Magento 2 Spark SQL integration.
The only positive aspect of the traditional approach is the completion of our final goal: you can move data from Spark SQL to your Magento 2 website. The default approach lets you establish the connection between two systems, but you face lots of limitations and need to spend tons of time on managing a single update. The process is far from being efficient, intuitive, and user-friendly. The following drawbacks turn it into an unpleasant routine:
Cons
- Three different environments to work with;
- The very limited export functionality of Spark SQL;
- The necessity to work with code;
- Time-consuming attributes editing;
- Attribute values editing in a data file;
- Slow and limited import tools;
- Inability to automate further data transfers.
Such a huge number of drawbacks, isn’t it? The necessity to do so many tasks manually and the inability to automate further import processes was a core reason to look for a better way of importing data from Spark SQL to Magento 2. And below you will find a solution that dramatically enhances the integration.
The Easy Way of Moving Spark SQL Bases into Magento 2
Meet the Improved Import & Export extension – our tool that simplifies every Magento 2 integration. It is the number one time-saver when it comes to import and export processes. The module reduces the complexity of the aforementioned approach to data import to the minimum. And below, we explain how the solution works saving your time and effort.
Spark SQL Database Connection
The first key improvement related to the Magento 2 Spark SQL integration is the ability to connect your ecommerce store to the database directly. Yes, it is no longer necessary to use data files to exchange information between the two platforms. Thus, you reduce the first two steps of the aforementioned approach. Right, there is no need to export CSV files and edit them. With the Improved Import and Export module, you replace the first two steps of the traditional approach to data transfers with more efficient solutions.
Instead of transferring files, you connect to Spark SQL directly creating a data flow between the two systems. Instead of editing attributes in a data file, you get the ability to match them within just a few clicks with the help of mapping presets. At the same time, it is possible to do everything manually if you want. There is a section in the extension’s interface that enables the corresponding functionality. This process is dramatically simplified in comparison to the default manual procedure.
And don’t forget about attribute values editing! The Improved Import & Export Magento 2 extension offers several tools designed to simplify this routine dramatically. You can save tons of time and effort by matching attribute values in the extension’s interface and creating rules to edit them in bulk. Both features are described below.
How to import data from Spark SQL to Magento 2
The data import process from Spark SQL to Magento 2 with the help of Improved Import & Export is very straightforward and intuitive. It consists of the following three steps:
- In your Magento admin, go to the Improved Import & Export extension and create a new import job in the import section.
- Specify the parameters required for the connection to the Spark SQL database, map attributes, create a schedule, and configure other settings if needed;
- Save and launch the profile.
As you can see, the Improved Import & Export extension dramatically decreases the time and effort necessary to connect your Magento 2 website to a remote Spark SQL database. More good news: because the process is fully automated, you prevent further time expenditures as well!
Pros & Cons
The pros and cons related to the Improved Import & Export module and the Magento 2 Spark SQL integration are the following:
Pros
- The most user-friendly connection of your Magento 2 website to the Spark SQL database;
- The direct data transfer between the two systems;
- Eliminated file export/import processes with a more user-friendly replacement;
- No file editing caused by the previous improvement;
- Fast attributes matching with the help of presets and mapping tools;
- Attribute values matching and editing right in the extension’s interface;
- Fully automated data transfers from Spark SQL to Magento 2.
Cons
Take a look at the traditional approach and compare the improved one with it. The difference is huge, but what are the drawbacks of the latter? It is the necessity to install a third-party module. But with our friendly support, both the installation and configuration become a positive experience – not a negative aspect. Contact us, and the Firebear team will help you solve any difficulties related to your integration. And don’t forget that we can implement custom ideas related to the import/export functionality!
The features of the Improved Import & Export Magento 2 extension are described below, but let’s focus on the Spark SQL characteristics before going any further.
Get Improved Import & Export Magento 2 Extension
Spark SQL Facts
Spark SQL is often compared with Hive, HBase, and Spark SQL. Being a component on top of ‘Spark Core’ for structured data processing, it combines features of a relational DBMS and key-value store. Other features are:
- The developer of the database is Apache Software Foundation;
- Its initial release is dated back to 2014; the current one – v2.3.1, June 2018;
- The database is available under the Open Source license;
- No cloud-based features are available;
- The implementation language is Scala;
- Server operating systems are Linux, OS X, and Windows;
- Data scheme and typing features are provided;
- XML isn’t supported;
- Secondary indexes are not represented;
- SQL-like DML and DDL statements are offered;
- APIs and other access methods include JDBC and ODBC;
- Supported programming languages are Java, Python, R, and Scala;
- Server-side scripts and triggers are not available;
- Partitioning methods are supported via Spark Core;
- Replication methods are not available;
- Foreign keys and transaction concepts are not represented;
- Concurrency and durability features are offered;
- No in-memory capabilities and user concepts are represented.
Improved Import & Export Magento 2 Extension
Below, we shed light on the major features of the Improved Import & Export extension and the way it simplifies the Magento 2 Spark SQL integration. Let’s begin with the data transfers automation.
Automated Import & Export Processes
The Improved Import & Export Magento 2 extension offers two ways to automate data transfers between your e-commerce store and Spark SQL. Firstly, it is possible to create an update schedule. Secondly, you can choose events that will be used as triggers launching the update when specific conditions are met.
Schedules
Our extension works with cron providing you with the ability to create any custom update schedule. It is only necessary to use the default cron syntax. Alternatively, choose one of the predefined intervals which are customizable as well.
Besides, the Improved Import & Export extension offers profiles without schedules which are usually used to satisfy the needs of one-time data migrations and asynchronous updates. And note that every scheduled profile can be launched manually whenever you need.
The following gif image shows how to configure the cron schedule section of our tool:
Events
As for the event-based transfers, they can be created on the basis of both system and custom events. Select a trigger and specify conditions to increase its efficiency. For further information, follow this link: How to Run Magento 2 Import or Export After Specific System Event or Process.
Advanced Mapping Features
Now, it is necessary to describe the core mapping features of the Improved Import & Export Magento 2 extension. Let’s start with the easiest way of attributes matching – presets.
Mapping Presets
Our module lets you solve the problem of different attribute standards in a few clicks. Select and apply a preset suitable for your integration to match attributes automatically. Next, the Improved Import & Export plugin will analyze the input data and map it to the corresponding internal designations. The feature is illustrated by the following gif image which shows the migration from Magento 1 to Magento 2 (yes, even the two versions of our favorite e-commerce platform use different attributes):
Matching Interface
If there is no preset for your particular integration or you want to do everything by your own hands, map attributes manually via a corresponding interface. Select a third-party designation and specify the internal one in front of it. That’s all! The following gif shows how intuitive and user-friendly the process is:
As for the third column, it is designed to determine hardcoded attribute values. Set a new attribute value, and it will be added to every item that uses the attribute.
Attribute Values Mapping
As for attribute values mapping, it is also possible with the Improved Import & Export Magento 2 extension. The process incorporates the same principle as the attributes matching described above. Select a third-party designation and specify the internal one in front of it. Thus, you will save tons of time in the future. For further information about the feature, follow this link: Attribute Values Mapping.
Attribute Values Editing
To increase the efficiency of the previous feature and your routine related to attribute values, we’ve also introduced the ability to edit them in bulk. This time, you don’t need to do that in a data file. Instead, the Improved Import & Export Magento 2 extension offers the dedicated section where you can apply various rules. Firstly, you can add a prefix to multiple attribute values. Secondly, you can edit them by specifying a suffix. Note that both rules can be applied simultaneously.
Thirdly, you can split multiple values within a few clicks if they are merged. Fourthly, the opposite process is possible as well. The following article explains attribute values editing in more detail: How to Modify Attribute Values During Import and Export in Magento 2.
Category Mapping
And the Improved Import & Export Magento 2 extension simplifies your routine related to catalog editing. While moving product data from external systems, you may face lots of different categories unrelated to yours. Products are usually bound to them, so you can import unnecessary data that affects the default order of your storefront. Luckily, our module offers a reliable solution: you can match external categories to the ones used internally right during the import process. The following gif image shows how to do that:
And you can create new categories on the fly. Select a parent category and specify a name of a new one. That’s it! Both features are described here: Category Mapping.
Attributes On The Fly
If your data file lacks any attributes, can create them on the fly. The Improved Import & Export Magento 2 module enables you to specify parameters necessary to do that via the following general form:
Attribute|attribute_property_name:attribute_property_value|…
You can find the full process of attributes on the fly creation here: Product attributes import.
Extended Connectivity Options
Although the Magento 2 Spark SQL integration is based on the direct connection, the Improved Import & Export Magento 2 extension provides multiple more ways to integrate your e-commerce store with external systems.
Multiple File Standards
As you might have already guessed, our plugin substantially increases the number of supported file formats in comparison to the default Magento platform. In the case of the default import tools, you should generate a CSV file from the Spark SQL table due to the native limitations. When it comes to the Improved Import & Export Magento 2 extension, it supports XML, JSON, ODS, and Excel. You can transfer data between your store and external systems using these formats and applying all the extension’s features to them.
Multiple File Sources
It is also necessary to say a few words about support for numerous sources. The Improved Import & Export extension offers the following three options:
- FTP/SFTP. You can exchange data between Magento 2 and Spark SQL using the FTP/SFTP connection which allows utilizing a local or remote server as a point of transfers.
- Dropbox. The same functionality is possible with a Dropbox account.
- URL. When it comes to moving data to your store, our plugin enables you to transfer files via a direct URL.
And don’t forget that Improved Import & Export unpacks file archives automatically. So if you’ve created a compressed output from a remote database, you no longer need to unzip it manually – the plugin will do everything for you.
Alternative Ways of Import & Export
There are also alternative ways of data transfers. Our module offers several robust options in this area. Firstly, it enables you to rely on various intermediaries to create a bridge between your Magento 2 website and external platforms. Secondly, an API connection can be used for the same purpose. Currently, the extension supports Google Sheets, Office 365 Excel, and Zoho Sheet integration. The former is illustrated below:
As for the second connection type, Improved Import & Export lets you integrate with external systems directly on the basis of API. The extension supports REST, SOAP, and GraphQL integrations to synchronize the data flow.
Watch the following video to see the extension in action:
Final Words
It’s obvious that the Improved Import & Export dramatically improves the way you transfer data not only between your e-commerce store and Spark SQL but also between Magento 2 and any external systems. You can fully automate all transfers making your daily routine less obtrusive. For further information about the Magento 2 Spark SQL integration on the basis of Improved Import & Export, follow this link and contact our support:
Get Improved Import & Export Magento 2 Extension