Modifying Rows During a Drupal 8 CSV Migration
Migrate Source CSV is currently the source plugin of choice for CSV-to-Drupal migrations with the Migrate API in Drupal 8. In this post I will demonstrate how to manipulate the CSV data in real time during the migrate:import operation. You can think of this as the equivalent of the prepareRow() approach you may have seen elsewhere, such as in my blog post Extending the Migrate Plus JSON Parser in Drupal 8.
Please make sure you have a working migration before you begin; it’ll make things easier to troubleshoot if you know you had a good starting point.
Here’s an example of a simple Drupal 8 migration that I’m working on. I removed all of the fields except the few needed to illustrate the solution. This setup migrates tab-delimited CSV data from a third party into a “Directory Entry” content type. The machine name of the content type is direntry. The CSV columns are listed under column_names in the migration YML, and the process section maps them to the fields of the content type.
Module Code Before Overriding
The file structure is:
```
modules/custom/mysite_migrate/
├── README.md
├── config
│   └── install
│       ├── migrate_plus.migration.mysite_direntry_nodes.yml
│       └── migrate_plus.migration_group.mysite_directory.yml
├── mysite_migrate.info.yml
```
Here is each of the files, before we add our class to override row data:
migrate_plus.migration.mysite_direntry_nodes.yml
```yaml
status: true
id: mysite_direntry_nodes
label: Mysite directory entry nodes
migration_group: mysite_directory
migration_tags:
  - Content

source:
  plugin: csv
  path: 'sites/default/files/_migrate/faculty.csv'
  delimiter: "\t"
  enclosure: '"'
  header_row_count: 1
  track_changes: true
  keys:
    - 'username'
  column_names:
    0:
      fullname: 'Fullname'
    2:
      username: 'Username'
    3:
      fname: 'FirstName'
    4:
      lname: 'LastName'

process:
  type:
    plugin: default_value
    default_value: direntry
  langcode:
    plugin: default_value
    source: language
    default_value: 'und'
  status: 1
  title: fullname
  field_first_name: fname
  field_last_name: lname
  field_username: username

destination:
  plugin: entity:node

migration_dependencies: {}

dependencies:
  enforced:
    module:
      - mysite_migrate
```
migrate_plus.migration_group.mysite_directory.yml
```yaml
# The id must match both the file name and the migration_group value
# used in the migration YML.
id: mysite_directory
label: Mysite Directory
description: Mysite directory entry migration (and associated migrations)

# Here we set a dependency on the module itself.
# This setting is necessary so that the configs are deleted from the database
# when the module is uninstalled.
dependencies:
  enforced:
    module:
      - mysite_migrate
```
mysite_migrate.info.yml
```yaml
name: 'Mysite Migrate'
type: module
description: "Provides Migrate API implementation for Mysite"
core: 8.x
package: 'Mysite'
dependencies:
  - migrate
  - migrate_plus
  - migrate_tools
  # Provides the csv source plugin and the CSVFileObject class.
  - migrate_source_csv
```
Manipulating Row Data
You can manipulate data by changing the file class that the source plugin uses. If you look at \Drupal\migrate_source_csv\Plugin\migrate\source\CSV::defaultConfiguration, you will see that the default file_class is Drupal\migrate_source_csv\CSVFileObject. This class has a current() method that re-keys each row by column name (as defined in your YML).
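To make the "keyed by column name" step concrete, here is a minimal standalone sketch of the idea. It is an illustration only, not the module's actual code: mysite_key_row_by_column_names() is a hypothetical helper, and the sample row is made up.

```php
<?php

/**
 * Hypothetical helper: keys a raw CSV row by the names from column_names.
 *
 * Illustration only; the real logic lives in CSVFileObject::current().
 */
function mysite_key_row_by_column_names(array $raw_row, array $column_names): array {
  $keyed = [];
  foreach ($column_names as $index => $names) {
    // Each entry maps a machine name to a label, e.g. ['fullname' => 'Fullname'].
    $machine_name = array_keys($names)[0];
    $keyed[$machine_name] = $raw_row[$index] ?? NULL;
  }
  return $keyed;
}

// Same structure as column_names in the migration YML.
$column_names = [
  0 => ['fullname' => 'Fullname'],
  2 => ['username' => 'Username'],
  3 => ['fname' => 'FirstName'],
  4 => ['lname' => 'LastName'],
];

// Made-up sample row; column 1 is not listed in column_names, so it is dropped.
$raw_row = ['Jane Doe', 'unused', 'jdoe', 'Jane', 'Doe'];

$row = mysite_key_row_by_column_names($raw_row, $column_names);
// $row is now:
// ['fullname' => 'Jane Doe', 'username' => 'jdoe', 'fname' => 'Jane', 'lname' => 'Doe']
```

This keyed array is the shape your own current() override receives from parent::current(), which is why the override can address values as $row['fullname'].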
To manipulate the row data you must override this current() method in your own file class. Here’s what that looks like:
First, we modify the migrate_plus.migration.mysite_direntry_nodes.yml source plugin configuration to use a new file class.
```yaml
source:
  plugin: csv
  path: 'sites/default/files/_migrate/faculty.csv'
  delimiter: "\t"
  enclosure: '"'
  header_row_count: 1
  track_changes: true
  file_class: 'Drupal\mysite_migrate\DirectoryEntryNodeCsvFileObject'
  keys:
    - 'username'
```
Next, we create the file class, which we place in mysite_migrate/src/DirectoryEntryNodeCsvFileObject.php.
Don’t forget to enable your mysite_migrate module and clear the cache so the new class is picked up.
```php
<?php

namespace Drupal\mysite_migrate;

use Drupal\migrate_source_csv\CSVFileObject;

/**
 * Defines a CSV file object that alters directory entry rows during import.
 */
class DirectoryEntryNodeCsvFileObject extends CSVFileObject {

  /**
   * {@inheritdoc}
   */
  public function current() {
    // The parent returns the row keyed by column name.
    $row = parent::current();
    // Hard-coded value to prove that the override is being used.
    $row['fullname'] = 'Testing Override Capabilities';
    return $row;
  }

}
```
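The file_class value and the file location have to agree, because Drupal autoloads module classes by PSR-4: the Drupal\mysite_migrate namespace maps to the module's src directory. The following sketch spells out that mapping. mysite_psr4_path() is a hypothetical helper written for this post, not a Drupal API.

```php
<?php

/**
 * Hypothetical helper: returns the file path PSR-4 expects for a module class.
 */
function mysite_psr4_path(string $class, string $module_dir): string {
  $parts = explode('\\', ltrim($class, '\\'));
  // Drop the "Drupal" segment and the module name segment; the rest is
  // relative to the module's src directory.
  $relative = array_slice($parts, 2);
  return $module_dir . '/src/' . implode('/', $relative) . '.php';
}

$path = mysite_psr4_path(
  'Drupal\mysite_migrate\DirectoryEntryNodeCsvFileObject',
  'modules/custom/mysite_migrate'
);
// $path is 'modules/custom/mysite_migrate/src/DirectoryEntryNodeCsvFileObject.php'
```

If the namespace, class name, or file name drifts from this pattern, the class cannot be autoloaded and the migration will not pick it up.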
All that is left is to test! I recommend testing with a single record. Here’s an example:
```
drush mi mysite_direntry_nodes --limit=1 --update
```
You can run this as many times as you’d like while you work out the logic for your current() method.
The first time I run this, one record is created with a title of “Testing Override Capabilities”, as expected, because the fullname column is mapped to the title in the process section of the YML.
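Once the hard-coded title proves the override is wired up, the placeholder can give way to real clean-up logic. The sketch below is one possible shape under stated assumptions: mysite_clean_direntry_row() is a hypothetical helper, and the rules it applies (trimming whitespace, falling back to first and last name when fullname is blank) are examples, not requirements of the module.

```php
<?php

/**
 * Hypothetical helper: cleans one keyed row before it reaches the process pipeline.
 */
function mysite_clean_direntry_row($row) {
  // Defensive assumption: if the value is not a keyed array, pass it through
  // unchanged rather than guessing at its structure.
  if (!is_array($row)) {
    return $row;
  }
  // Trim stray whitespace from every string value; keys are preserved.
  $row = array_map(function ($value) {
    return is_string($value) ? trim($value) : $value;
  }, $row);
  // Build fullname from the name parts when the source left it blank.
  if (empty($row['fullname'])) {
    $row['fullname'] = trim(($row['fname'] ?? '') . ' ' . ($row['lname'] ?? ''));
  }
  return $row;
}

// Made-up sample row, already keyed by column name.
$cleaned = mysite_clean_direntry_row([
  'fullname' => '',
  'username' => ' jdoe ',
  'fname' => ' Jane',
  'lname' => 'Doe ',
]);
// $cleaned is now:
// ['fullname' => 'Jane Doe', 'username' => 'jdoe', 'fname' => 'Jane', 'lname' => 'Doe']
```

Inside the file class, current() would then return mysite_clean_direntry_row(parent::current()) instead of setting a fixed string. Keeping the logic in a plain function like this means it can be exercised with sample arrays without running the migration each time.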