Drupal Migration Kick Start and Cheat Sheet
Drupal migrations! They can be complicated, and even once you get started with them, it’s easy to get confused or lost, or to feel like you’re swimming upstream in a sea of microtasks and options. As Software Architect at Phase2, I’ve collected a set of resources to help you develop and run content migrations for Drupal 10 and 11.
Drupal core includes a robust Migrate API
Start here to get oriented to what’s included in Drupal core. It’s a lot. You don’t have to read it all right away. Just know that these resources exist and are valuable to return to in the future.
Equipped with that introduction to Drupal core’s migration capabilities, let’s dive into the cheat sheet to help you with your Drupal migration development.
1. Essential contrib
There are several contributed modules that greatly enhance the process of developing and running migrations. Here they are, along with why I find each one helpful.
2. The “Again” script
As you go about developing migrations, you’ll find yourself repeating a whole lot of steps. Import 1. Import 10. Add mappings for a few more fields. Roll back. Import some more. Putting the set of commands that you’re repeating into a simple script file saves a lot of time. And putting set -x at the beginning makes sure that the terminal output still shows what commands are being run.
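#!/usr/bin/env bash
# Echo each command as it runs.
set -x
MIGRATION=example_location
# Roll back whatever was imported on the previous pass.
drush mr "$MIGRATION"
# Re-import the migration config edited in the custom module.
drush cim --partial --source=modules/custom/example_import/config/install/ -y
drush cr
# Import a single item with debug output.
drush mim "$MIGRATION" --limit=1 --migrate-debug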
3. Find available migration plugins, especially process plugins, and their configuration options.
One of the key tasks of developing migrations is transforming the source data so that it matches the destination Drupal fields. That’s what process plugins are for. They are provided by core, contrib modules, and custom development unique to your site. Here’s a good way to find all of the process plugins currently enabled and available on your site.
Click on the Plugins button beside plugin.manager.migrate.process and you'll see all the available process plugins currently enabled on your site.
Generally how to use these pages:
4. Find drush options
There are many migration options in drush. Knowing what they are and how to find details about them will speed your migration development process.
5. Development workflow
Here’s my typical development workflow, involving editing config files, importing config, testing changes, and capturing final config.
6. Migration Groups
A migration group can be used for setting configuration that is common across several individual migrations.
In this example, there is a migration group that is used for all terms, and imports from a JSON source.
migrate_plus.migration_group.term.yml
id: term
label: Terms
shared_configuration:
  source:
    plugin: url
    data_fetcher_plugin: http
    data_parser_plugin: json
    urls:
      - 'https://www.example.com'
    track_changes: true
    # Set skip_count to true if the paging through to collect the count starts
    # taking annoyingly long.
    #skip_count: true
    pager:
      # The "cursor" pager type uses a token style value from the JSON document
      # to use as a param value in the next page request.
      type: cursor
      # URL param key for next page.
      key: pageToken
      # JSON: path to key regarding current/next page. Specifics are TBD.
      selector: response/nextPageToken
    item_selector: response/docs
    fields:
      0:
        name: id
        label: 'Unique identifier'
        selector: '/id'
    ids:
      id:
        type: string
  destination:
    plugin: 'entity:taxonomy_term'
    # Set default_bundle in individual migration
    #default_bundle: example
Migrations identify which group they’re part of, and add to the configuration.
migrate_plus.migration.term_condition.yml
id: term_condition
label: Conditions from JSON
# Identify the group to inherit config from.
migration_group: term
# Add config in addition to what is in the group shared_configuration.
source:
  fields:
    1:
      name: name
      label: Name
      selector: name
    2:
      name: synonyms
      label: Synonyms
      selector: taxonomy_synonyms
destination:
  default_bundle: condition
process:
  name: name
  field_original_id: id
  field_synonyms: synonyms
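To make the inheritance concrete, here is roughly what the effective configuration of term_condition looks like once the group's shared_configuration is merged in. This is an illustrative sketch, not output from a tool: it assumes nested keys are merged key by key and that the numeric keys under fields are preserved, which would explain why the group uses 0 and the migration continues with 1 and 2.

```yaml
# Illustrative only: approximate effective configuration of term_condition
# after the "term" group's shared_configuration is merged in.
id: term_condition
label: Conditions from JSON
migration_group: term
source:
  plugin: url
  data_fetcher_plugin: http
  data_parser_plugin: json
  urls:
    - 'https://www.example.com'
  track_changes: true
  pager:
    type: cursor
    key: pageToken
    selector: response/nextPageToken
  item_selector: response/docs
  fields:
    # From the group.
    0:
      name: id
      label: 'Unique identifier'
      selector: '/id'
    # From the individual migration.
    1:
      name: name
      label: Name
      selector: name
    2:
      name: synonyms
      label: Synonyms
      selector: taxonomy_synonyms
  ids:
    id:
      type: string
destination:
  plugin: 'entity:taxonomy_term'
  default_bundle: condition
process:
  name: name
  field_original_id: id
  field_synonyms: synonyms
```

The individual migration only has to state what is unique to it: two extra fields, the bundle, and the process mappings.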
Important update, from Benji Fisher: The migrate_tools module now enables shared configuration (https://www.drupal.org/node/3263258), similar to how migration groups work. The most notable difference to me is that with shared configuration, you can include multiple configuration sets in a single migration. This could be very useful, for example, if all content types use a common set of field mappings (title, published, and so on) and a subset of content types additionally share some other field mappings. Migration groups still work, but shared configuration seems more flexible.
7. Process Examples
The process section of migration configuration is where most of the work is. This is where data is transformed from the incoming source and mapped to Drupal fields. Here are some examples that show essential techniques.
Simple 1:1 field import. Source data format matches exactly the destination.
In this example, the value of the source created property (an integer timestamp) should be imported verbatim into the destination field. So the process mapping can be this one simple line:
created: created
That’s exactly the same as explicitly using the get process plugin:
created:
  plugin: get
  source: created
Simple chained process
In this example, we need to alter the data slightly, before it is ready for the destination, because the source data sometimes has leading or trailing whitespace.
title:
  - plugin: get
    source: title
  - plugin: callback
    callable: trim
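The explicit get step is optional, because a process plugin can read a source property directly through its own source key. A minimal equivalent sketch, assuming the same title source property:

```yaml
title:
  # The callback plugin reads "title" itself, so no separate get step is needed.
  plugin: callback
  callable: trim
  source: title
```

The chained form is still handy once more steps are added, since each later plugin receives the output of the one before it.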
Entity Generate / Entity Lookup
In this example, terms are generated in the Search Categories vocabulary, but only if they do not yet exist. Entity Generate extends Entity Lookup, which is how it first determines whether the desired entity exists before creating one.
featured_data:
  -
    plugin: static_map
    source: field_featured
    map:
      '1': 'Featured'
      '2': 'Secondary'
      '3': 'On Deck'
    default_value: null
  -
    plugin: skip_on_empty
    method: process
  -
    plugin: entity_generate
    entity_type: taxonomy_term
    # Taxonomy uses "vid" (vocabulary id) as bundle keys. Nodes use "type".
    bundle_key: vid
    bundle: search_categories
    # Taxonomy uses "name" as the primary value. Nodes use "title".
    value_key: name
By the time the entity_generate plugin is reached in this example, the source value will be one of the mapped words: "Featured", "Secondary", or "On Deck". Any other value maps to null and is skipped by skip_on_empty. The entity_generate plugin creates a term with that name in the defined bundle only if it does not yet exist, and returns the ID of the term.
To explore the bundle_key and value_key settings, one way is to look in the database at the taxonomy_term_field_data and node_field_data tables and see what the column names are.
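For comparison, a lookup against nodes swaps in the node keys mentioned in the YAML comments above. The sketch below uses entity_lookup, which only finds existing entities and does not create them. The field name field_related_article, the source property related_title, and the article bundle are hypothetical names chosen for illustration.

```yaml
field_related_article:
  plugin: entity_lookup
  # Hypothetical source property holding the title of an existing article.
  source: related_title
  entity_type: node
  # Nodes use "type" as the bundle key and "title" as the primary value.
  bundle_key: type
  bundle: article
  value_key: title
```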
Migration Lookup
Use this when one migration needs to set values in a reference field, referencing content that was imported by a different migration. In this example, the current migration needs to reference content created by a migration named “taxonomy_term_topic”.
field_associated_topic:
  -
    plugin: migration_lookup
    source: field_assoc_topics
    no_stub: true
    migration: taxonomy_term_topic
field_associated_topic is a term reference field, and its values need to be set based on the term IDs from a term migration.
The source is an ID: the original ID of a term.
The migration is the migration that brought in these terms.
The result is the ID in the new site of the imported term.
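One common refinement, shown here as a sketch rather than as part of the original migration: with no_stub set, a source ID that was never imported produces an empty result, so a skip_on_empty step after the lookup keeps the field from being populated with an empty reference.

```yaml
field_associated_topic:
  - plugin: migration_lookup
    source: field_assoc_topics
    no_stub: true
    migration: taxonomy_term_topic
  # If the lookup found nothing, leave this field unset and carry on with the row.
  - plugin: skip_on_empty
    method: process
```

Using method: row instead would skip the whole item when the referenced term is missing, which is a stricter choice.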
Media and files import
A common way to import media into Drupal is to use two migrations: one for the basic files (the actual JPGs, PDFs, etc), and one for the media items (for the metadata like a friendly name or tags). A migration lookup is used in the media migration to reference the files from the files migration.
Content migrations that reference this media through media reference fields can use a migration lookup to retain that connection.
In the following example, the source CSV has columns for image_url and name (name of article). And the articles will have image references intact.
File migration.
id: example_file_image
label: Images from CSV
source:
  plugin: csv
  path: '/var/www/files/example.csv'
  ids:
    - primary_key
destination:
  plugin: 'entity:file'
process:
  status:
    plugin: default_value
    default_value: 1
  _destination_uri:
    - plugin: skip_on_empty
      source: image_url
      method: row
    # Custom plugin that sets the destination URL.
    - plugin: prepare_destination_image_url
  uri:
    - plugin: skip_on_404
      method: row
      source: image_url
    - plugin: file_copy
      file_exists: 'use existing'
      source:
        - image_url
        - '@_destination_uri'
Media migration, looks up files from the file migration.
id: example_media_image
label: Image media from CSV
migration_dependencies:
  required:
    - example_file_image
source:
  constants:
    image_alt_prefix: Image for
    image_name_suffix: example image
  plugin: csv
  path: '/var/www/files/example.csv'
  ids:
    - primary_key
destination:
  plugin: 'entity:media'
process:
  uid:
    plugin: default_value
    default_value: 1
  name:
    plugin: concat
    source:
      - name
      - constants/image_name_suffix
    delimiter: ' '
  status:
    plugin: default_value
    default_value: 1
  bundle:
    plugin: default_value
    default_value: image
  field_media_image/alt:
    plugin: concat
    source:
      - constants/image_alt_prefix
      - name
    delimiter: ' '
  field_media_image/target_id:
    - plugin: skip_on_empty
      method: row
      source: image_url
    - plugin: migration_lookup
      source: primary_key
      migration: example_file_image
Node migration, looks up image media from the media migration.
id: node_article
label: Articles from CSV
migration_dependencies:
  required:
    - example_media_image
source:
  plugin: csv
  path: '/var/www/files/example.csv'
  ids:
    - primary_key
  track_changes: true
destination:
  plugin: 'entity:node'
  default_bundle: article
process:
  # Truncated for brevity.
  field_image/target_id:
    - plugin: skip_on_empty
      source: image_url
      method: process
    - plugin: migration_lookup
      source: primary_key
      # Must match the id of the media migration above.
      migration: example_media_image
Wrapping up
Okay, that was a stack of techniques and info I find myself returning to repeatedly. Do you have any migration shortcuts you could share? I’d love to hear them. I hope this was helpful to you! Please let me know your thoughts and if you might find more posts like this useful in your work. Have a (mi)great day!
Bonus
Bonus tip from Benji Fisher: You will end up with specific and unique questions and use cases. A great place to get answers and discuss issues is the #migration channel in Drupal Slack.