An Elasticsearch from the past

Here’s the procedure I came up with to migrate an elasticsearch 1.1 database to version 6 (6.4 in my case, but probably any 6.x version).

  1. Fire up a temporary elasticsearch version 1.1

Fetch the tar.gz version from https://www.elastic.co/downloads/past-releases/elasticsearch-1-1-2 and untar it.
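
For instance, assuming the archive was downloaded to ~/tmp (the exact file name comes from the download page):

$ cd ~/tmp
$ tar xzf elasticsearch-1.1.2.tar.gz
$ cd elasticsearch-1.1.2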

Use the following basic configuration file

$ egrep -v '^[[:space:]]*(#|$)' ~/tmp/elasticsearch-1.1.2/config/elasticsearch.yml 
http.port: 9202
transport.tcp.port: 9302
path.conf: /home/imil/tmp/elasticsearch-1.1.2/config
path.data: /var/db/elasticsearch

Note that I changed the standard ports to $((standard_port + 2)).

From the untarred directory, launch elasticsearch

$ ES_HOME=$(pwd) ES_INCLUDE=$(pwd)/bin/elasticsearch.in.sh bin/elasticsearch -p ./es.pid

Check that it’s working correctly by listing indexes

$ curl -X GET "localhost:9202/_cat/indices?v"

Next, as pointed out by the documentation, we need to create the indexes, types and mappings in the new database. In order to do so, we will need the previous mappings

$ curl -X GET "localhost:9202/rhonrhon/_mapping/"|json_pp > rhonrhon.mapping

In my example, the original index (version 1.1) is called rhonrhon

{
   "mappings" : {
      "gcutest_infos" : {
         "properties" : {
            "date" : {
               "type" : "date",
               "format" : "dateOptionalTime"
            },
            "topic" : {
               "type" : "text"
            },
            "users" : {
               "type" : "text"
            },
            "ops" : {
               "type" : "text"
            },
            "channel" : {
               "type" : "text"
            }
         }
      },
      "gcu_infos" : {
         "properties" : {
            "topic" : {
               "type" : "text"
            },
            "ops" : {
               "type" : "text"
            },
            "users" : {
               "type" : "text"
            },
            "channel" : {
               "type" : "text"
            },
            "date" : {
               "type" : "date",
               "format" : "dateOptionalTime"
            }
         }
      },

...

  }
}

As you can see, this index had multiple types in it, which was perfectly legit in versions < 5, but it turns out elasticsearch is removing mapping types and, moreover, now only supports one type per index:

Indices created in Elasticsearch 6.0.0 or later may only contain a single mapping type. Indices created in 5.x with multiple mapping types will continue to function as before in Elasticsearch 6.x. Mapping types will be completely removed in Elasticsearch 7.0.0.

It was therefore mandatory to split our previous mapping by type, one file per type (a way to script the split is sketched after the following example)

{
   "mappings" : {
      "gcu_infos" : {
         "properties" : {
            "topic" : {
               "type" : "text"
            },
            "ops" : {
               "type" : "text"
            },
            "users" : {
               "type" : "text"
            },
            "channel" : {
               "type" : "text"
            },
            "date" : {
               "type" : "date",
               "format" : "dateOptionalTime"
            }
         }
      }
   }
}
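
If the old index holds more than a couple of types, splitting the dump by hand gets tedious. Assuming jq is installed, a small loop like this one could produce one mapping file per type from the rhonrhon.mapping dump shown earlier (the loop and file names are mine, chosen to match the files used below):

$ for t in gcu_infos gcutest_infos; do
    jq --arg t "$t" '{mappings: {($t): .mappings[$t]}}' rhonrhon.mapping > "$t.mapping"
  done
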
  2. Now fire up the target elasticsearch 6 service as you normally would

Note that according to the official migration guide, you should set refresh_interval to -1 and number_of_replicas to 0 for faster reindexing. I didn’t do it; everything went well, though it took a couple of minutes.
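
For the record, had I done it, that tuning could be applied to each newly created index with the index settings API, something along these lines (to be reverted once reindexing is over):

$ curl -XPUT http://localhost:9200/gcu_infos/_settings -H 'Content-Type: application/json' \
    -d '{"index": {"refresh_interval": "-1", "number_of_replicas": 0}}'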

Create the mappings for each type / index in the new elasticsearch database

$ curl -XPUT http://localhost:9200/gcutest_infos -H 'Content-Type: application/json' -d "$(cat gcutest_infos.mapping)"
$ curl -XPUT http://localhost:9200/gcu_infos -H 'Content-Type: application/json' -d "$(cat gcu_infos.mapping)"
$ ...

curl -X GET "localhost:9200/_cat/indices?v" should display one index for each type.

Now, in order to start the actual synchronization, we must tell the new cluster where to reindex from and where to reindex to. Here is the body for gcu_infos, saved as reindex_gcu_infos.json

{
  "source": {
    "remote": {
      "host": "http://localhost:9202"
    },
    "index": "rhonrhon",
    "type": "gcu_infos"
  },
  "dest": {
    "index": "gcu_infos"
  }
}
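
One thing worth checking before running the reindex: elasticsearch 6 only accepts reindex-from-remote sources that are explicitly whitelisted, so the destination cluster’s elasticsearch.yml will most likely need a line such as the following (followed by a restart):

reindex.remote.whitelist: "localhost:9202"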

And finally trigger the migration by reindexing every index / type

$ curl -XPOST http://localhost:9200/_reindex -H 'Content-Type: application/json' -d "$(cat reindex_gcu_infos.json)"
$ curl -XPOST http://localhost:9200/_reindex -H 'Content-Type: application/json' -d "$(cat reindex_gcutest_infos.json)"
...
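
Reindexing can take a while on large indexes; if you are curious about the progress, the tasks API can list the running reindex operations from another terminal (purely optional):

$ curl -X GET "localhost:9200/_tasks?detailed=true&actions=*reindex"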

There we go!

$ curl -X GET "localhost:9200/_cat/indices?v"
health status index         uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   gcu           tr_rFfjGRDKjVtrDHK46fg   5   1    6083599            0    973.2mb        973.2mb
yellow open   gcu_infos     ul7-xX8AT4ugckLn_8a-gQ   5   1      87161            0     57.7mb         57.7mb
yellow open   gcutest_infos n2OE75poTV-2gDms1Eteaw   5   1      43644            0      3.9mb          3.9mb
yellow open   gcutest       -rTMi8ZFRNicQfvvA9FDWw   5   1        931            0    212.5kb        212.5kb

See you at the next backward compatibility breakage!