I use kopf for visualization. Using kopf, I changed the number-of-replicas setting for an index from 17 to 20, so each replication group has 21 copies (the index has 20 primary shards and the cluster spans 3 availability zones). What I observe is that some of the new replicas get assigned (mostly in groups of 60) while the others stay unassigned. Any pointers on debugging this issue? Elasticsearch version is 1.7.
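For anyone reproducing this without kopf, the same change can be made through the index update settings API (`PUT /<index>/_settings`). A minimal sketch that only builds the request and does not send it; the host `http://localhost:9200` and the index name `my_index` are placeholders, not values from this cluster.

```python
import json
import urllib.request


def build_replica_update(base_url, index, replicas):
    """Build (but do not send) a PUT /<index>/_settings request."""
    body = json.dumps({"index": {"number_of_replicas": replicas}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{index}/_settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Placeholder host and index name; send with urllib.request.urlopen(req).
req = build_replica_update("http://localhost:9200", "my_index", 20)
```

Building the request separately from sending it makes it easy to inspect the URL and body before touching a live cluster.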
To share more information: allocation mainly fails with "too many shards on nodes for attribute: [aws_availability_zone]", but I can clearly see that there are many hosts in the availability zone without these shards. The hack I use to force allocation is to increase the number of replicas to a relatively high value, so that the cluster picks up a bunch of shards for allocation. I then reduce the replicas back to what is required, and this solves the problem of unassigned shards. The process is very annoying when you have many indices :/
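One way to see which replication groups are affected is to count unassigned copies per shard from the output of `GET /_cat/shards?h=index,shard,prirep,state`. A small sketch of that counting step; the sample text is made up for illustration and is not output from this cluster.

```python
from collections import Counter


def count_unassigned(cat_shards_text):
    """Count UNASSIGNED shard copies per (index, shard id)."""
    counts = Counter()
    for line in cat_shards_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        index, shard, _prirep, state = fields[:4]
        if state == "UNASSIGNED":
            counts[(index, int(shard))] += 1
    return counts


# Made-up sample in the column order index, shard, prirep, state.
sample = """\
my_index 0 p STARTED
my_index 0 r STARTED
my_index 0 r UNASSIGNED
my_index 1 p STARTED
my_index 1 r UNASSIGNED
my_index 1 r UNASSIGNED
"""
unassigned = count_unassigned(sample)
```

Running this before and after changing the replica count would show whether the unassigned copies are spread across all shard ids or concentrated in a few.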
We have the following settings for forced awareness:
`cluster.routing.allocation.awareness.force.availability_zone.values: zone1, zone2, zone3`
`cluster.routing.allocation.awareness.attributes: availability_zone`
These get applied when assigning and relocating shards, but are new replicas added to the replication group also included when calculating "shardPerAttribute"?
Adding a little more analysis: with 11 replicas and 20 primaries, there are 240 shards in total.
Zone 1: 73 assigned / 7 unassigned
Zone 2: 65 assigned / 15 unassigned
Zone 3: 78 assigned / 2 unassigned
I tried manually rerouting one of the shards to a host in each zone and got: NO(too many shards on nodes for attribute: [availability_zone])
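As a quick sanity check on these figures: 20 primaries with 11 replicas each gives 240 shard copies, and each zone's assigned plus unassigned counts add up to exactly one third of that. The sketch below only redoes the arithmetic from the numbers above; the zone keys are placeholders.

```python
primaries = 20
replicas = 11

# zone: (assigned, unassigned), as reported in the post
zones = {
    "zone1": (73, 7),
    "zone2": (65, 15),
    "zone3": (78, 2),
}

total_copies = primaries * (replicas + 1)   # primaries plus all their replicas
even_share = total_copies // len(zones)     # copies per zone under an even split
per_zone_total = {z: a + u for z, (a, u) in zones.items()}
assigned = sum(a for a, _ in zones.values())
unassigned = sum(u for _, u in zones.values())
```

Here `total_copies` is 240, `even_share` is 80, every entry in `per_zone_total` is 80, and the cluster-wide split is 216 assigned versus 24 unassigned.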
I also ran reroute with explain to get unassigned_info; it reports "reason": "NODE_LEFT", which is expected.
I am curious: what happens when the number of hosts in these zones is not equal? Will that create any imbalance in assigning shards? Our index setting for "total_shards_per_node" is the default.
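For reference, a reroute call with explain can be put together as below. This is a sketch that only builds the path and JSON body for `POST /_cluster/reroute` with a single `allocate` command. Adding `dry_run` is my assumption about what is wanted while debugging (it asks for the decision without applying it), and the index and node names are placeholders.

```python
import json


def build_reroute_allocate(index, shard, node, dry_run=True):
    """Return (path, json_body) for a reroute 'allocate' command with explain."""
    path = "/_cluster/reroute?explain"
    if dry_run:
        path += "&dry_run"  # only report decisions, do not change the cluster
    body = {
        "commands": [
            {"allocate": {"index": index, "shard": shard, "node": node}}
        ]
    }
    return path, json.dumps(body)


# Placeholder index and node names.
path, body = build_reroute_allocate("my_index", 3, "node-in-zone2")
```

The explanations in the response list each allocation decider's verdict, which is where the "too many shards on nodes for attribute" message shows up.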
I am a bit confused. How many indices do these 240 shards belong to? How many primary shards does each index have? What is the number of replicas set to for these indices? How many data nodes do you have per availability zone?
Yes, all nodes have the allocation awareness parameter set; I verified that. Are you saying that with 3 zones we can only have a 1 primary + 2 replicas setting? What if we have more replicas, what is the expected behavior?
I would expect those replicas to be unassigned. If you wanted to have 5 replicas (6 copies of each shard), you could divide up a zone into parts, e.g. `zone1a`, `zone1b`, `zone2a`, `zone2b`, `zone3a` and `zone3b`. If you leave out or alter the forced allocation parameter, Elasticsearch will try to allocate one shard per zone and will now be able to place one primary shard and 5 replicas. This is quite well explained in the [example given here](https://www.elastic.co/guide/en/elasticsearch/reference/1.7/modules-cluster.html#forced-awareness).
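To make the expectation in this reply concrete, here is a toy model in which each distinct zone value can hold at most one copy of a given shard. It models only the reasoning stated above and is not taken from Elasticsearch's allocation code, so the real allocator may behave differently.

```python
def copies_placed(copies, zone_values):
    """Return (placed, unassigned) if each zone value holds at most one copy."""
    distinct = len(set(zone_values))
    placed = min(copies, distinct)
    return placed, copies - placed


# 5 replicas means 6 copies of each shard.
three_zones = copies_placed(6, ["zone1", "zone2", "zone3"])
six_zones = copies_placed(
    6, ["zone1a", "zone1b", "zone2a", "zone2b", "zone3a", "zone3b"]
)
```

Under this model, three zone values leave 3 of the 6 copies unassigned, while six zone values place all 6.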
I am curious: with the current force settings, if I look at the replication group of shard 3, it should have 9 unassigned shards, but all of them are assigned evenly across zones. That is strange and not the expected behavior!
[quote="Dhara_Desai, post:4, topic:93450"] "too many shards on nodes for attribute: [aws_availability_zone]" [/quote] Am curious where this error comes from, given that you have specified the allocation awareness attribute as just `availability_zone`. Is there a mismatch in the configuration?
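A mismatch of this kind can be checked mechanically by comparing the attribute name embedded in each `awareness.force.<attribute>.values` key against the names listed in `awareness.attributes`. A minimal sketch over a flat settings dictionary; the helper name `mismatched_force_attributes` is made up for illustration, and the second example dictionary is a hypothetical mismatched configuration, not this cluster's actual settings.

```python
FORCE_PREFIX = "cluster.routing.allocation.awareness.force."
ATTRS_KEY = "cluster.routing.allocation.awareness.attributes"


def mismatched_force_attributes(settings):
    """Return forced-awareness attribute names missing from awareness.attributes."""
    listed = {a.strip() for a in settings.get(ATTRS_KEY, "").split(",") if a.strip()}
    forced = set()
    for key in settings:
        if key.startswith(FORCE_PREFIX) and key.endswith(".values"):
            forced.add(key[len(FORCE_PREFIX):-len(".values")])
    return sorted(forced - listed)


# Settings as posted earlier in the thread: names agree.
consistent = mismatched_force_attributes({
    FORCE_PREFIX + "availability_zone.values": "zone1, zone2, zone3",
    ATTRS_KEY: "availability_zone",
})

# Hypothetical mixed-up configuration: names disagree.
mixed = mismatched_force_attributes({
    FORCE_PREFIX + "aws_availability_zone.values": "zone1, zone2, zone3",
    ATTRS_KEY: "availability_zone",
})
```

The same comparison should also be made against the attribute name actually set on each node, since that is the third place the name has to match.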
As per this conversation, I understand that removing "cluster.routing.allocation.awareness.force.availability_zone.values: zone1, zone2, zone3" might resolve this issue. I'll quickly test it, as I don't see the need for this setting for now, because we need more than 2 replicas for sure.
[quote="Christian_Dahlqvist, post:18, topic:93450"] Am curious where this error comes from, given that you have specified the allocation awareness attribute as just availability_zone. Is there a mismatch in the configuration? [/quote]
Sorry about that, there is no mismatch. The attribute is aws_availability_zone, and I observed this error when I ran reroute for a particular shard.
[quote="Christian_Dahlqvist, post:22, topic:93450"] Here it does however seem to be availability_zone and not aws_availability_zone. [/quote]
Right, that's a configuration template that I picked up from our documentation. But I verified it against the actual settings.