Streamline Drupal Content Migration with Automated Transfers

30 Apr, 2019
9 min read

Looking for automated transfer of content to Drupal? Say hello to Migrations.

Whether you are moving from a Drupal 7 website to Drupal 8 or from another technology (such as WordPress, Joomla, AEM, Sitecore, TYPO3 or DotNetNuke) to Drupal 8, the most challenging task is migrating the content. Doing this by hand calls for a great deal of manual effort, which is rarely worthwhile and sometimes impossible. Drupal 8 addresses this with built-in migration features. Let's take a look at how. I will elaborate as much as possible so that this article can serve as a full guide to migrating content in Drupal 8.

Before we get into the how, let me create a glossary of some terms that I will use throughout.

The Migrate API is the key to moving content from any source to Drupal 8. Migrations are basically ETL processes, where ETL stands for Extract, Transform, Load.

Extract: This is the first phase, in which data is extracted from a source via source plugins. The source may be an older version of a Drupal website or a non-Drupal one; the data may be exported as CSV, XML, etc., or read directly from the database of the existing website.

Transform: This is the phase in which the extracted data is transformed using process plugins.

Load: This phase loads the transformed data into a destination in our Drupal 8 website.

Drupal 8 ships with a predefined set of source, process and destination plugins that contain the code for fetching data from the source, transforming it into a D8-compatible format and loading it into a destination.
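
To make the three phases concrete, here is a minimal sketch in Python rather than Drupal code. The sample rows, the function names and the list that stands in for the Drupal site are all assumptions made for illustration; real migrations do this work through source, process and destination plugins.

```python
import csv
import io

# Hypothetical two-column source, used only for this illustration.
SOURCE_CSV = "Book ID,Title\n1, Ulysses \n2,Dubliners\n"


def extract(text):
    """Extract: yield one dict per source row, keyed by the header row."""
    yield from csv.DictReader(io.StringIO(text))


def transform(row):
    """Transform: map source columns onto destination field names."""
    return {"title": row["Title"].strip(), "type": "book"}


def load(records, destination):
    """Load: save each record; a plain list stands in for the Drupal site."""
    for record in records:
        destination.append(record)
    return destination


def run_migration(text):
    """Chain the three phases for every row of the source."""
    return load((transform(row) for row in extract(text)), [])


nodes = run_migration(SOURCE_CSV)
```

Each phase only knows about its own job, which is the same separation the plugin types give us in Drupal.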

 

As already mentioned, D8's migration framework, together with contributed modules, can migrate data from CSV, XML and a lot more. Here is how I used a CSV export from SharePoint to move content to a D8 site.

Illustration: Migrating data from a CSV source

Consider the following CSV export from any website that we wish to migrate to Drupal.

books_info.csv

Book ID,Title,Year of publication,Author,Summary
1,In Search of Lost Time v2,2001,Marcel Proust,"Swann's Way, the first part of A la recherche de temps perdu, Marcel Proust's seven-part cycle, was published in 1913. In it, Proust introduces the themes that run through the entire work."
2,Ulysses,2014,James Joyce,"Ulysses chronicles the passage of Leopold Bloom through Dublin during an ordinary day, June 16, 1904. The title parallels and alludes to Odysseus (Latinised into Ulysses), the hero of Homer's Odyss..."

 

Plan of Migration:

  1. A custom module named custom_migration is created in the modules/custom/ directory. Its dependencies are:
    • migrate
    • migrate_tools
    • migrate_plus
    • migrate_source_csv
  2. The module contains configuration files (with names starting with migrate_plus.migration) in its /config/install directory, each defining an ID, Source, Destination and Mapping (Process). Enabling the module generates the respective configuration entities for the migration.
  3. Source data is present in CSV format within the assets folder: custom_migration/assets/csv/[.........].csv.
  4. Migrate Tools Drush commands are used to manage the migration.

Note:

  • The first row in the CSV is the header row. It does not contain actual data; its column names are used as the keys for the values of each item to be migrated. header_row_count in the source section of the migration template specifies how many such rows there are.
  • delimiter, in this case, is ‘,’ since that is the character separating adjacent values in a row.
  • enclosure is needed so that text such as a description, which may itself contain the delimiter, is read as a single value. This will be illustrated while explaining the migration of “text long with summary”.
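
The effect of these three settings can be tried out with a short Python sketch. This is not how migrate_source_csv is implemented; read_rows is a hypothetical helper, and the summaries in the sample data are shortened versions of the ones in books_info.csv.

```python
import csv
import io

# Shortened copy of books_info.csv; the summaries are abbreviated here.
BOOKS_CSV = (
    "Book ID,Title,Year of publication,Author,Summary\n"
    '1,In Search of Lost Time v2,2001,Marcel Proust,"Swann\'s Way, the first part of the cycle."\n'
    '2,Ulysses,2014,James Joyce,"Ulysses chronicles the passage of Leopold Bloom through Dublin."\n'
)


def read_rows(text, delimiter=",", enclosure='"', header_row_count=1):
    """Parse CSV text into dicts keyed by the last header row."""
    lines = list(csv.reader(io.StringIO(text), delimiter=delimiter, quotechar=enclosure))
    header = lines[header_row_count - 1]
    return [dict(zip(header, values)) for values in lines[header_row_count:]]


rows = read_rows(BOOKS_CSV)

# Splitting on the delimiter alone, ignoring the enclosure, breaks the summary in two.
naive_fields = BOOKS_CSV.splitlines()[1].split(",")
```

With the enclosure respected, the first summary stays one value even though it contains a comma; the naive split produces six fields instead of five.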

 

Migration template:

Create a migration template named migrate_plus.migration.book_info.yml

id: book_info
label: Migrate Book Info
migration_group: Books
#source part begins
source:
  plugin: csv
  # Full path to the file.
  path: 'modules/custom/custom_migration/assets/csv/books_info.csv'
  # Column delimiter. Comma (,) by default.
  delimiter: ','
  # Field enclosure. Double quotation marks (") by default.
  enclosure: '"'
  # The number of rows at the beginning which are not data.
  header_row_count: 1
  # Re-import rows whose source data has changed since the last run.
  track_changes: true
  # The column(s) to use as a key. Each column specified will
  # create an index in the migration table and too many columns
  # may throw an index size error.
  keys:
    - Book ID
#process section contains the field mappings
process:
  # title is default title field in Drupal 8 nodes
  # Title is key from csv export that contains data for migration
  title: Title
  field_publication_year: Year of publication
  field_author: Author
  # body needs to be migrated by specifying value, format and summary.
  # this is the way to migrate text long with summary
  # It is obvious that formatted texts have a value and a format part.
  # Body field also contains the summary part.
  'field_body/value': Summary
  'field_body/format':
    plugin: default_value
    default_value: full_text
  'field_body/summary': ''
#target where data is migrated
destination:
  #entity:node plugin is used to migrate to the bundle namely book
  plugin: entity:node
  default_bundle: book

  • Every migration template has an id that uniquely identifies it.
  • label provides a display name for the template.
  • Multiple migrations may be grouped together by specifying migration_group. These three keys make up the basic data of the template.
  • Next comes the source section. Here we specify details of the source such as the plugin, the path to the CSV file, the delimiter, the enclosure and so on. plugin: csv refers to the plugin named csv in the src/Plugin/migrate/source directory of migrate_source_csv, which contains the code that reads the file and returns each row as an array according to the settings specified.
  • The process section contains the individual field mappings. Each field can be passed through one or more process plugins to bring its value into a form that Drupal accepts. The body field is an example: its format is set by the default_value plugin, which is used to specify defaults in our template. A list of the core process plugins can be found here: https://www.drupal.org/docs/8/api/migrate-api/migrate-process-plugins/list-of-core-migrate-process-plugins
  • Finally, we need to specify the destination. Here we have used the entity:node plugin with default_bundle set to “book”.
  • Note that every migration template is different, and it is important to understand how the source data can be converted into a form that Drupal's field types accept. This conversion is achieved with the various plugins that already exist in the migration modules. If these plugins do not fulfil our requirements, we may need to write a custom one.

Once our template is in place we need to do the following:

  • Enable the module using drush en custom_migration. This imports the YAML files in the module's config/install directory as configuration.
  • Next, list the migrations using drush ms. This shows all migration IDs along with their status.
  • Run a migration using drush migrate-import [migration-id]
  • To roll a migration back, use drush migrate-rollback [migration-id]

There are several options that can be used with these commands. You may view the entire list using the help command.


A few illustrations:

Migrate to a Link field 

Suppose the source contains the actual URL (column url) and a label/title (column uri_title) for the link. A link field in Drupal has two properties, uri and title. These should be specified explicitly in the template.

...
process:
  'field_url/uri': url
  'field_url/title': uri_title

In case we want a default title for the link fields:

...
process:
  'field_url/uri': url
  'field_url/title':
    plugin: default_value
    default_value: 'Click here!'

 

Migration to paragraphs within nodes

Suppose we have a multivalued paragraph field that is used to add “contact details” to a node, and the paragraph type “contact details” contains a Contact name and a Contact email.

Here we need to keep in mind that each paragraph must be created separately, by its own migration, and then linked to the node from the node's main migration template.

For this, the migration template for the node must contain the following:

...
  # Temporary property that looks up the paragraph created by the other migration.
  temp_contact:
    plugin: migration_lookup
    # Name of the migration that needs to be used for the lookup.
    migration: contact
    # Prevent the creation of a stub entity when no match is found.
    no_stub: true
    source: ID
  field_contact:
    # The iterator plugin processes each entry of a multi-value source
    # (it is named sub_process in later Drupal 8 releases).
    plugin: iterator
    # The looked-up value(s) to iterate over.
    source:
      - '@temp_contact'
    process:
      # Paragraphs are referenced by revision, hence both target_id
      # and target_revision_id are needed. '0' and '1' are the positions
      # of the ID and the revision ID in each looked-up value.
      target_id: '0'
      target_revision_id: '1'

The template for the creation of each paragraph is as follows:

id: contact
source:
...
  # The key is important; it should be a primary identifier.
  keys:
    - ID
...
process:
  type:
    plugin: default_value
    #machine name of paragraph type
    default_value: contact
  field_contact_name: Contact name
  field_email_address: Contact Email
destination:
  plugin: entity_reference_revisions:paragraph

 

Migrating data into the address field

The address field also follows the concept of inserting data into each property separately:

  # country_code of the address field is assigned a default value of US
  'field_address/country_code':
    plugin: default_value
    default_value: US
  # langcode is assigned a default value of en
  'field_address/langcode':
    plugin: default_value
    default_value: en
  # Address 1, Address 2, City, Zip Code and State are CSV columns that are inserted into individual parts of an address.
  'field_address/address_line1': Address 1
  'field_address/address_line2': Address 2
  'field_address/locality': City
  'field_address/postal_code': Zip Code
  'field_address/administrative_area': State
  'field_address/family_name':
    plugin: default_value
    default_value: NA
  'field_address/given_name':
    plugin: default_value
    default_value: NA
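
One way to read the 'field/property' keys used here and in the link and body examples is as a flat way of writing a nested value. The Python sketch below illustrates that reading only; expand_subfields is a hypothetical helper and the address values are made up.

```python
def expand_subfields(flat):
    """Turn {'field/property': value} pairs into {'field': {'property': value}}."""
    fields = {}
    for key, value in flat.items():
        field, _, prop = key.partition("/")
        if prop:
            fields.setdefault(field, {})[prop] = value
        else:
            fields[key] = value
    return fields


# Made-up source row using the column names from the template above.
row = {"Address 1": "1 Main St", "City": "Springfield", "Zip Code": "62701", "State": "IL"}

processed = {
    "title": "Head office",
    "field_address/country_code": "US",
    "field_address/address_line1": row["Address 1"],
    "field_address/locality": row["City"],
    "field_address/postal_code": row["Zip Code"],
    "field_address/administrative_area": row["State"],
}

node_values = expand_subfields(processed)
```

All keys that share the field_address prefix end up as properties of a single field value, which is why each property can be mapped on its own line.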

 

Migration of Boolean type

Suppose a boolean field is created in Drupal 8 and the CSV contains data for this field in the form of TRUE/FALSE under the column named Switch. In this case we use the static_map plugin to pre-define the value that gets stored in the Drupal 8 field for a particular value in the CSV. The map is an array (of one or more dimensions) that defines the mapping between source values and destination values.

....
  field_boolean:
    plugin: static_map
    source: Switch
    map:
      'TRUE': 1
      'FALSE': 0
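
The lookup that static_map performs can be pictured with a few lines of Python. This is only a sketch of the idea: the function below is a hypothetical stand-in, and raising an error for an unmapped value is a choice made for this example.

```python
# Same mapping as in the template above.
BOOLEAN_MAP = {"TRUE": 1, "FALSE": 0}


def static_map(value, mapping):
    """Return the destination value for a source value, or fail loudly."""
    if value not in mapping:
        raise KeyError(f"No mapping defined for source value {value!r}")
    return mapping[value]


switched_on = static_map("TRUE", BOOLEAN_MAP)
switched_off = static_map("FALSE", BOOLEAN_MAP)
```

Note that the match is exact, so a source value such as "true" in lower case would need its own entry in the map.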

 

Migration to Body field or text long with summary type:

...  
  'field_body/value': Copy_Content
  field_body/format:
    plugin: default_value
    default_value: rich_text
  'field_body/summary': ''

 

Migrate to taxonomy entity reference fields:

Let us assume the data for terms sits under the header “Terms”, separated by commas, as follows: Jira, Slack, Agile

First of all, we need to explode the data in the Terms column using ‘, ’ as the separator. The separator may be any character or string that does not occur within the values themselves.

Secondly, we pass the result to the entity_generate plugin, which creates terms on the fly in the vocabulary named taxonomy_type. entity_generate extends the base plugin entity_lookup and creates an entity if it isn’t present.

entity_lookup looks up an entity and returns its ID if it already exists; otherwise, it returns NULL.

...  
  field_tax_reference:
    -
      #explode is used to split data using a specific separator from csv.
      plugin: explode
      delimiter: ', '
      source: Terms
    -
      plugin: entity_generate
      entity_type: taxonomy_term
      #key for the bundle taxonomy is vid
      bundle_key: vid
      #taxonomy where terms need to be created
      bundle: taxonomy_type
      #name for the terms should contain the data
      value_key: name
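
The two steps, splitting the column and then looking up or creating each term, can be sketched in Python as follows. The Vocabulary class is a hypothetical in-memory stand-in for the taxonomy_type vocabulary, and the sequential term IDs are an assumption of this sketch.

```python
def explode(value, delimiter):
    """Split one source value into a list of values."""
    return value.split(delimiter)


class Vocabulary:
    """In-memory stand-in for a taxonomy vocabulary (hypothetical)."""

    def __init__(self):
        self._ids_by_name = {}

    def lookup(self, name):
        """Like entity_lookup: return the term ID, or None if it is missing."""
        return self._ids_by_name.get(name)

    def generate(self, name):
        """Like entity_generate: look the term up and create it if missing."""
        term_id = self.lookup(name)
        if term_id is None:
            term_id = len(self._ids_by_name) + 1
            self._ids_by_name[name] = term_id
        return term_id


vocabulary = Vocabulary()
first_row = [vocabulary.generate(name) for name in explode("Jira, Slack, Agile", ", ")]
second_row = [vocabulary.generate(name) for name in explode("Slack, Scrum", ", ")]
```

The second row reuses the existing Slack term and creates only the term that is new, which is the behaviour we want when the same term appears in many rows.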

 

Migrate to multi-valued plain text: 

A multivalued field requires the source to contain the data for all entries in the same column, separated by a unique character/symbol. In the template, we then specify the delimiter and explode the source data. For a single-valued field, the explode plugin is not needed.

...
  field_plain_text:
    -
      plugin: explode
      delimiter: ','
      source: Plain text data

 

Migration to date field:

The format_date plugin converts a date from from_format to to_format. Both are PHP date format strings.

 ... 
  field_date:
    plugin: format_date
    from_format: 'n/j/y'
    to_format: 'Y-m-d'
    source: Date
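
The same conversion can be checked quickly in Python. The format strings below are Python's closest equivalents of the PHP formats 'n/j/y' and 'Y-m-d', not the strings used in the template, and the sample dates are made up.

```python
from datetime import datetime


def format_date(value, from_format="%m/%d/%y", to_format="%Y-%m-%d"):
    """Parse a date string in one format and write it out in another."""
    return datetime.strptime(value, from_format).strftime(to_format)


converted = format_date("4/30/19")
padded = format_date("1/5/19")
```

Here "4/30/19" becomes "2019-04-30" and "1/5/19" becomes "2019-01-05", the year-month-day form produced by to_format.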

 

How a migration template is written depends entirely on the format of the data obtained from the source and the type of field the data is fed into. Migrations call for well-planned logic that feeds data to Drupal while creating and maintaining all its dependencies on the go. This, together with a clear understanding of how data is stored in D8 fields and how source data maps onto entities and fields, will help you automate the entire data migration. Moving content from any source to D8 is easy with Migrations. Say goodbye to tiresome content updates and hello to Migrations.