Migrating users into Drupal – Part 2

Today we complete the user migration example. In the previous post, we covered how to migrate email, timezone, username, password, and status. This time, we cover creation date, roles, and profile pictures. The source, destination, and dependencies configurations were already explained, so this entry jumps straight to the process transformations.

Example field mapping for user migration

Getting the code

You can get the full code example at https://github.com/dinarcon/ud_migrations. The module to enable is `UD users`, whose machine name is `ud_migrations_users`. The two migrations to execute are `udm_user_pictures` and `udm_users`. Notice that both migrations belong to the same module. Refer to this article to learn where the module should be placed.

The example assumes Drupal was installed using the `standard` installation profile. In particular, we depend on a Picture (`user_picture`) image field attached to the user entity. The word in parentheses is the machine name of the image field.

The explanation below is only for the user migration. It depends on a file migration to get the profile pictures. One motivation to have two migrations is for the images to be deleted if the file migration is rolled back. Note that other techniques exist for migrating images without having to create a separate migration. We have covered two of them in the articles about subfields and constants and pseudofields.

Migrating user creation date

Have a look at the previous post for details on the source values. For reference, the user creation time is provided by the `member_since` column, and one of the values is `April 4, 2014`. The following snippet shows how the various date-related user properties are set:

created:
  plugin: format_date
  source: member_since
  from_format: 'F j, Y'
  to_format: 'U'
changed: '@created'
access: '@created'
login: '@created'

The `created` entity property stores a UNIX timestamp of when the user was added to Drupal. The value is an integer representing the number of seconds since the epoch. For example, `280299600` represents `Sun, 19 Nov 1978 05:00:00 GMT`. Kudos to the readers who knew this is the default value of Drupal’s `Expires` HTTP header. Bonus points if you knew it was chosen in honor of someone’s birthdate. 😉

Back to the migration, you need to transform the provided date from `Month day, year` format to a UNIX timestamp. To do this, you use the `format_date` plugin. The `from_format` is set to `F j, Y` which means your source date consists of:

  • The full textual representation of a month: `April`.
  • Followed by a space character.
  • Followed by the day of the month without leading zeros: `4`.
  • Followed by a comma and another space character.
  • Followed by the full numeric representation of a year using four digits: `2014`.

If the value of `from_format` does not make sense, you are not alone. It is assembled from the format characters of the `date` PHP function. When you need to specify the `from` and `to` formats, look at that function's documentation and assemble a string that matches the desired date format. Pay close attention, because uppercase and lowercase letters represent different things: `Y` and `y` stand for the year with four digits and two digits respectively. Some date components have subtle variations, like `d` and `j` for the day with and without leading zeros respectively. Also take into account white space and date component separators. To finish the plugin configuration, set `to_format` to something that produces a UNIX timestamp. If you look again at the documentation, you will see that `U` does the job.
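
As a further illustration of assembling a format string, here is a minimal sketch for a differently formatted source. It assumes a hypothetical `joined_on` column (not part of the example module) holding values like `04/04/14`: month and day with leading zeros, and a two-digit year.

```yaml
# Sketch only: `joined_on` is a hypothetical source column with values like 04/04/14.
created:
  plugin: format_date
  source: joined_on
  # m = month with leading zeros, d = day with leading zeros, y = two-digit year.
  from_format: 'm/d/y'
  # U = seconds since the UNIX epoch.
  to_format: 'U'
```

The only differences from the snippet above are the source column and the `from_format` string; the separators in the format (`/`) have to match the separators in the source value.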

The `changed`, `access`, and `login` entity properties are also dates in UNIX timestamp format. `changed` indicates when the user account was last updated. `access` indicates when the user last accessed the site. `login` indicates when the user last logged in. For brevity, the same value assigned to `created` is also assigned to these three entity properties. The at sign (@) means copy the value of a previous mapping in the process pipeline. If needed, each property can be set to a different value or left unassigned. None of them is actually required.
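
If the source did provide separate dates, each property could get its own `format_date` mapping instead of reusing `@created`. A sketch, assuming a hypothetical `last_seen` column (not present in the example source) that uses the same `F j, Y` format as `member_since`:

```yaml
created:
  plugin: format_date
  source: member_since
  from_format: 'F j, Y'
  to_format: 'U'
# Hypothetical column: `last_seen` does not exist in the example source.
access:
  plugin: format_date
  source: last_seen
  from_format: 'F j, Y'
  to_format: 'U'
# Reuse the value already computed for `access`.
login: '@access'
# `changed` is left unmapped in this sketch.
```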

Migrating user roles

For reference, the roles are provided by the `user_roles` column, and one of the values is `forum moderator, forum admin`. It is a comma-separated list of roles from the legacy system, which need to be mapped to Drupal roles. It is possible that the `user_roles` column is not provided at all in the source. The following snippet shows how the roles are set:

roles:
  - plugin: skip_on_empty
    method: process
    source: user_roles
  - plugin: explode
    delimiter: ','
  - plugin: callback
    callable: trim
  - plugin: static_map
    map:
      'forum admin': administrator
      'webmaster': administrator
    default_value: null

First, the `skip_on_empty` plugin is used to skip the processing of the roles if the source column is missing. Then, the `explode` plugin is used to break the list into an array of strings representing the roles. Next, the `callback` plugin invokes the `trim` PHP function to remove any leading or trailing whitespace from the role names. Finally, the `static_map` plugin is used to manually map values from the legacy system to Drupal roles. All of these plugins have been explained previously. Refer to other articles in the series or the plugin documentation for details on how to use and configure them.

A few things are worth mentioning about migrating roles with this particular process pipeline. If the comma-separated list includes spaces before or after a role name, you need to trim the value because the static map performs an equality check. Extraneous space characters will produce a mismatch.
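
To make the effect of each step concrete, the following traces the sample value through the pipeline. It is an illustration written as YAML data, not migration configuration; the key names are made up for the trace.

```yaml
# Illustration only: these keys are labels for the trace, not plugin settings.
source_value: 'forum moderator, forum admin'
after_explode:
  - 'forum moderator'
  - ' forum admin'      # note the leading space left after splitting on ','
after_trim:
  - 'forum moderator'
  - 'forum admin'
after_static_map:
  - null                # 'forum moderator' is not in the map
  - administrator       # 'forum admin' matches the map key exactly
```

Without the `trim` step, `' forum admin'` would not equal the `'forum admin'` key and would fall through to the default value as well.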

Also, you do not need to map the `anonymous` or `authenticated` roles. Drupal users are assumed to be `authenticated` and cannot be `anonymous`. Any other role needs to be mapped manually to its machine name. You can find the machine name of any role on its edit page. In the example, only two out of four roles are mapped. Any role that is not found in the static map will be assigned the value `null`, as indicated in the `default_value` configuration. After processing, the `null` value will be ignored, and no role will be assigned. But you could use this feature to assign a default role in case the static map does not produce a match.
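
A minimal sketch of that last idea follows. It assumes a hypothetical role with machine name `member` already exists on the destination site; the rest of the pipeline is unchanged from the example.

```yaml
roles:
  - plugin: skip_on_empty
    method: process
    source: user_roles
  - plugin: explode
    delimiter: ','
  - plugin: callback
    callable: trim
  - plugin: static_map
    map:
      'forum admin': administrator
      'webmaster': administrator
    # Hypothetical role: any legacy role not listed above becomes `member`.
    default_value: member
```

With this configuration every unmatched legacy role maps to the same value, so a user with several unmatched roles may end up with `member` repeated in the list. Whether that needs deduplication is worth verifying on your own site.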

Migrating profile pictures

For reference, the profile picture is provided by the `user_photo` column and one of the values is `P01`. This value corresponds to the unique identifier of one record in the `udm_user_pictures` file migration, which is part of the same demo module. It is important to note that the `user_picture` field is not a user entity property. The field is created by the `standard` installation profile and attached to the user entity. You can find its configuration in the “Manage fields” tab of the “Account settings” configuration page at `/admin/config/people/accounts`. The following snippet shows how profile pictures are set:

user_picture/target_id:
  plugin: migration_lookup
  migration: udm_user_pictures
  source: user_photo

Image fields are entity references. Their `target_id` property needs to be an integer number containing the file id (`fid`) of the image. This can be obtained using the `migration_lookup` plugin. Details on how to configure it can be found in this article. You could simply use `user_picture` as your field mapping because `target_id` is the default subfield and could be omitted. Also, note that the `alt` subfield is not mapped. If present, its value will be used for the alternative text of the image. But if it is not specified, like in this example, Drupal will automatically generate an alternative text out of the username. An example value would be: `Profile picture for user michele`.
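
The two variations mentioned above can be sketched as follows. They are alternatives, not meant to be combined in one migration. The `photo_description` column in the second option is hypothetical and does not exist in the example source.

```yaml
# Option 1: rely on `target_id` being the default subfield.
user_picture:
  plugin: migration_lookup
  migration: udm_user_pictures
  source: user_photo

# Option 2 (use instead of option 1): map subfields explicitly and set `alt`.
user_picture/target_id:
  plugin: migration_lookup
  migration: udm_user_pictures
  source: user_photo
# Hypothetical source column providing the alternative text.
user_picture/alt: photo_description
```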

Technical note: The user entity contains other properties you can write to. For a list of available options, check the baseFieldDefinitions() method of the User class defining the entity. Note that more properties can be available up in the class hierarchy.
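
As a sketch of writing to other properties, the mappings below set two more user base fields. The `email` source column name is an assumption about the example source, and the property names should be checked against `baseFieldDefinitions()` for your Drupal version.

```yaml
# `init` holds the email address used when the account was first created.
# Assumption: `email` is the source column with the user's address.
init: email
# Set a fixed preferred language for every migrated account.
preferred_langcode:
  plugin: default_value
  default_value: en
```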

And with that, we wrap up the user migration example. We covered how to migrate a user’s mail, timezone, username, password, status, creation date, roles, and profile picture. Along the way, we presented various process plugins that had not been used previously in the series. We showed a couple of examples of process plugin chaining to make sure the migrated data is valid and in the format expected by Drupal.

What did you learn in today’s blog post? Did you know how to process dates for user entity properties? Have you migrated user roles before? Did you know how to import profile pictures? Please share your answers in the comments. Also, I would be grateful if you shared this blog post with others.