Drupal 8 Migrate – Tips and Tricks

This is an ongoing brain dump of quick tips for working with Drupal 8 Migrate. These items may seem true to me as I’m writing them today, but I’m constantly being reminded that things aren’t always as they seem with Migrate. It’s a wonderful system but can be tricky to understand. I recommend reading through my other Migrate-related posts, especially the post on debugging.

Executing

  • If you execute with --update and the migration fails or you kill it, the next run, even without --update, still has to re-import every record the first run didn’t reach. So only use --update if you’re okay with the system resetting the entire migration (and re-importing every item in it).

Source

  • Using track_changes: true in the source settings is NOT the same as using drush mi blog --update. With track_changes enabled and --update omitted, the migration skips any records that haven’t changed and updates those that have.
  • If you need to pull children into paragraphs, and the children are part of the parent data source, it’s easiest to create another migration for the paragraph items and use a custom parser to extract the children out of the main file; this way they get treated as their own data source.
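
To make the difference between track_changes and --update concrete, here is a minimal Python sketch of the selection logic. It is a model of the behaviour described above, not Migrate's code: the function names, the dictionary standing in for the migrate_map table, and the JSON-based hash are all assumptions made for illustration.

```python
import hashlib
import json


def row_hash(row):
    """Stable hash of a source row (stand-in for the hash kept per mapped row)."""
    return hashlib.sha256(json.dumps(row, sort_keys=True).encode("utf-8")).hexdigest()


def rows_to_import(source_rows, id_map, track_changes=False, update=False):
    """Return the source ids a run would import or re-import.

    id_map is a hypothetical stand-in for the map table:
    source id -> {"hash": str, "needs_update": bool}.
    """
    if update:
        # --update flags every previously imported row before the run starts.
        # The flag is only cleared when a row is actually re-imported, so an
        # interrupted run leaves the remaining rows flagged.
        for entry in id_map.values():
            entry["needs_update"] = True

    selected = []
    for row in source_rows:
        entry = id_map.get(row["id"])
        if entry is None or entry["needs_update"]:
            selected.append(row["id"])  # new row, or flagged for update
        elif track_changes and entry["hash"] != row_hash(row):
            selected.append(row["id"])  # source data changed since last import
    return selected


rows = [{"id": 1, "title": "a"}, {"id": 2, "title": "b"}, {"id": 3, "title": "c"}]
id_map = {r["id"]: {"hash": row_hash(r), "needs_update": False} for r in rows}
rows[1]["title"] = "b (edited)"

print(rows_to_import(rows, id_map, track_changes=True))  # [2]
print(rows_to_import(rows, id_map, update=True))         # [1, 2, 3]
print(rows_to_import(rows, id_map))                      # [1, 2, 3], flags are still set
```

The last call mirrors the warning in the Executing section: because this sketch never clears the flags, it behaves like a run that was killed right after --update marked the rows.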

Process

  • Iterator requires an array.
  • Some plugins (even Get) do not return the same data structure for each record. For example, the Xml parser from migrate_plus sometimes returns data with a child wrapper (if there is more than one child), and sometimes returns data without the child wrapper (if there is only one child).
  • If you’re going to use any process plugins on a field, always use the skip_on_empty plugin first.
    • UPDATE: see https://www.drupal.org/project/drupal/issues/2800279#comment-12374627
  • Entity Reference fields only require a target_id. Entity Reference Revision fields (like Paragraphs) require target_id and target_revision_id. I’ve written a process plugin that ensures migration_lookup results have the proper structure.
  • Trying to set select list values (with static_map) to keys that aren’t allowed may result in ignored records.
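
Two of the points above come down to forcing data into a predictable shape before later plugins see it. The Python sketch below illustrates both ideas; the helper names are hypothetical, and the assumption that a lookup yields an (id, revision id) pair is mine, so treat it as a model of the logic rather than the actual process plugin mentioned above.

```python
def as_list(value):
    """Always hand back a list, whether the parser gave one child or many."""
    if value is None:
        return []
    if isinstance(value, list):
        return value
    return [value]


def as_revision_reference(lookup_result):
    """Shape an assumed (id, revision_id) pair for an Entity Reference Revisions field."""
    target_id, target_revision_id = lookup_result
    return {"target_id": target_id, "target_revision_id": target_revision_id}


# One child arrives bare, several arrive wrapped in a list.
single = {"title": "Only child"}
several = [{"title": "First"}, {"title": "Second"}]

print(as_list(single))    # [{'title': 'Only child'}]
print(as_list(several))   # [{'title': 'First'}, {'title': 'Second'}]
print(as_revision_reference((12, 34)))  # {'target_id': 12, 'target_revision_id': 34}
```

Normalizing first means anything that iterates (such as Iterator, which requires an array) always receives the same structure.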

Dependencies

  • If you have “required” dependencies, the number of processed rows recorded for the dependency migration must be greater than or equal to the number of rows in its source for the dependency to be satisfied. See \Drupal\migrate\Plugin\Migration::allRowsProcessed and https://www.drupal.org/node/2797505#comment-12129356.
  • Having duplicate (by key) records in the source will throw off the rows processed vs. the total number of rows. If, for example, the source has 40 rows and 3 of them duplicate the key of another row, you will see that 40 were processed, but you will only see 37 records in the migrate_map table for this migration. This causes anything depending on this migration to fail with a “Missing migrations: migration name here” error.
  • Migrations that are “required” dependencies will be read in and parsed before the dependent migration runs. If you have a very heavy data set this can slow down the migration considerably. If you make the large dependency “optional” the dependent migration will not parse the dependency migration. You would want to be sure to run the dependency migration first, of course. Another option for speeding things up is to make sure your parser isn’t over-processing the data source.
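
The duplicate-key arithmetic is easy to check with a few lines of Python. This is a sketch under stated assumptions: all_rows_processed is a simplified stand-in loosely modeled on the check referenced above, not the Drupal method itself, and the key values are made up.

```python
def all_rows_processed(source_count, processed_count):
    """A required dependency counts as satisfied when every source row is accounted for."""
    if source_count < 0:
        # Assumption: an uncountable source is treated as complete.
        return True
    return source_count <= processed_count


# 40 source rows, 3 of which repeat the key of an earlier row.
source_keys = list(range(37)) + [0, 1, 2]

source_count = len(source_keys)       # rows the source reports
map_rows = len(set(source_keys))      # one map record per unique key

print(source_count, map_rows)                      # 40 37
print(all_rows_processed(source_count, map_rows))  # False
```

Because the map can hold only one record per key, the 3 duplicate rows never add to the count, and the dependency stays unsatisfied until the source keys are made unique.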

Errors and Ignored Items

  • Errors may be written to the migration’s corresponding migrate_message_xxxxx table. Check there first. If you don’t see anything, consider stepping through each processRow().
  • Items may be ignored for several reasons (e.g., select list having invalid value). Read here for debugging tips: https://stackoverflow.com/a/51597027/1023773 (contents below too)
    • My preference is to use Xdebug and PHP exception breakpoints to explore what’s happening when the various migrate exceptions are thrown. I use PhpStorm, so I open the breakpoints window with the Run > View Breakpoints menu item and add the following PHP breakpoints:
      1. Drupal\migrate\MigrateException
      2. Drupal\migrate\MigrateSkipRowException (the most useful for ignored rows)
      3. Drupal\migrate\MigrateSkipProcessException

      I then execute the migration (before this, make sure PhpStorm is listening and working correctly with Xdebug). When one of these exceptions is thrown, PhpStorm pauses execution. I use the Frames window to browse the call stack and inspect the variables at each frame. This makes it easy to identify the field and the data (source, destination, row, etc.) at each step of the way.
