
Auto-Recovery, Activity Feeds, Host Details and More

Apr 13, 2015 2:26:57 PM / by Damien Toledo

Nirmata is pleased to announce new features and improvements to our solution. Our focus has been on resiliency and state management:

  • Service instance auto-recovery
  • Environment activity feed
  • System events in activity feeds
  • Host agent version, Docker version and host details
  • Delete option for specific hosts in a host group
  • Enhanced pre-validation checks during environment creation
  • Better handling of failure and recovery scenarios

Configurable Auto-Recovery

Some users have requested more control over the built-in auto-recovery capability in Nirmata. This is especially useful for certain traditional applications that are deployed in a high-availability setup and do not need to recover automatically. As a result, auto-recovery can now be disabled in the Scaling and Recovery policy for applications and environments.

You can control auto-recovery in two ways. First, you can configure auto-recovery as part of the Scaling and Recovery policy. Once a scaling and recovery rule is created for a given application, it is applied to every environment subsequently created for that application.

Second, you can control auto-recovery at the environment level. Go to the environment details view and edit the scaling policy.

Environment Activity Feed

You can now view the activity feed for an environment directly in the environment details view. To open the environment activity feed panel, go to the environment details view and click the activity feed icon.

System Events in Activity Feeds

System events can now be viewed in all activity feeds. These events include all the state transitions of your cloud resources and environments.

Pre-Validation before Creating an Environment

When a user deploys a new environment, the Nirmata platform now verifies that the most basic resources are available and sufficient to create this environment. If this basic validation fails for one or more service instances, none of the services are deployed. The basic validation includes the following checks:

  • There is a resource selection rule associated with each service to be deployed
  • There is at least one host in connected state in each host group required to create the environment
  • There is enough memory available across all the hosts in the host groups required to create the environment
  • The ports required to create the environment are available in the host groups

Note that these checks were already performed previously, but they did not prevent the environment from being created, which resulted in failures while the service instances were being created. Now the validation is performed up front, and the user has to fix any errors before the environment can be created.
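To make the four checks concrete, here is a minimal sketch of how such an up-front validation could be structured. It is not Nirmata's implementation: the `Host` and `Service` classes, their fields, and the `prevalidate` function are hypothetical names chosen for illustration, and the rule that a port counts as available when at least one connected host has it free is an assumption.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Host:
    # Hypothetical model of a host inside a host group.
    connected: bool
    free_memory_mb: int
    used_ports: set = field(default_factory=set)


@dataclass
class Service:
    # host_group is None when no resource selection rule matches the service.
    name: str
    host_group: Optional[str]
    memory_mb: int
    ports: set = field(default_factory=set)


def prevalidate(services, host_groups):
    """Return a list of errors; an empty list means creation may proceed."""
    errors = []
    memory_needed = {}
    for svc in services:
        # Check 1: a resource selection rule exists for the service.
        if svc.host_group is None:
            errors.append(f"{svc.name}: no resource selection rule")
            continue
        # Check 2: at least one connected host in the required host group.
        hosts = [h for h in host_groups.get(svc.host_group, []) if h.connected]
        if not hosts:
            errors.append(f"{svc.name}: no connected host in {svc.host_group}")
            continue
        memory_needed[svc.host_group] = (
            memory_needed.get(svc.host_group, 0) + svc.memory_mb
        )
        # Check 4 (assumption): a port is available if some connected host has it free.
        for port in sorted(svc.ports):
            if all(port in h.used_ports for h in hosts):
                errors.append(
                    f"{svc.name}: port {port} unavailable in {svc.host_group}"
                )
    # Check 3: enough memory across all connected hosts of each host group.
    for group, needed in memory_needed.items():
        free = sum(h.free_memory_mb for h in host_groups[group] if h.connected)
        if needed > free:
            errors.append(f"{group}: needs {needed} MB, only {free} MB free")
    return errors
```

The function collects every error instead of stopping at the first one, so a user could fix all reported problems in one pass before retrying the environment creation.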

View Host Details, Agent Version & Docker Version

You can view the host agent version and the Docker version in the host group details view. To view this information, you must update your host agents to the latest version.

You can also view additional details for each host. The information provided is equivalent to the details provided by the Docker commands “docker info” and “docker version”. To display the host details, go to a host group details view and right-click on the “Action” pull-down menu, then select “View Details”.

Delete Specific Host in a Host Group

If you want to remove a host from a host group, you now have two options: you can decrement the number of desired hosts, or you can select a specific host to delete. In the latter case, the desired host count is automatically decremented.

Better Handling of Failures and Recovery Scenarios

We have introduced a new state, “Unknown”, for host groups, hosts, environments and service instances. The goal of this state is to cope better with intermittent connectivity issues and upgrade scenarios. When the Nirmata platform loses connectivity with a Nirmata host agent, it marks the corresponding host, as well as the service instances running on it, as “Unknown”. When an environment is in this state, the recovery algorithms do not kick in and your application keeps running normally. When a host is in the “Unknown” state, the Nirmata platform attempts to contact the host using the native cloud provider APIs. If the host is not reported by the cloud provider APIs, or if it is reported in a failed/down state, the host transitions to a “Not Connected” state. Its service instances then transition to a failed state and the recovery algorithms are applied.
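The host state transitions described above can be sketched as a small state function. This is only an illustration under stated assumptions: the `HostState` enum, the `next_host_state` and `should_run_recovery` functions, and the `provider_status` values are hypothetical, and the return to “Connected” once the agent is reachable again is an assumption that the text above does not spell out.

```python
from enum import Enum


class HostState(Enum):
    CONNECTED = "Connected"
    UNKNOWN = "Unknown"
    NOT_CONNECTED = "Not Connected"


def next_host_state(state, agent_reachable, provider_status):
    """Compute the next host state.

    provider_status is a hypothetical summary of the cloud provider API
    answer: "running", "failed", "down", or None if the host is not reported.
    """
    if agent_reachable:
        # Assumption: a reachable agent brings the host back to Connected.
        return HostState.CONNECTED
    if state is HostState.CONNECTED:
        # Connectivity lost: mark as Unknown first, do not recover yet.
        return HostState.UNKNOWN
    if state is HostState.UNKNOWN:
        # Ask the cloud provider before declaring the host lost.
        if provider_status is None or provider_status in ("failed", "down"):
            return HostState.NOT_CONNECTED
        return HostState.UNKNOWN
    return state


def should_run_recovery(state):
    # Recovery is applied only once the host is confirmed Not Connected.
    return state is HostState.NOT_CONNECTED
```

The point of the intermediate state is visible in `should_run_recovery`: a host that is merely unreachable never triggers recovery on its own, which is what keeps applications running through a short network interruption or an agent upgrade.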

Another improvement has been made in the case of a service instance failure. We now always test whether the recovery procedure is likely to succeed before deleting the failed service instance and creating a new one. This prevents service instances from being constantly created and deleted when there are not enough resources available.
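The idea of checking first and replacing second can be sketched as follows. The `recover_instance` function and its `can_place`, `delete` and `create` callbacks are hypothetical stand-ins for whatever the platform uses internally; the sketch only shows the ordering of the steps.

```python
def recover_instance(instance, can_place, delete, create):
    """Replace a failed instance only if a replacement is likely to succeed.

    can_place, delete and create are hypothetical callbacks supplied by the
    caller; can_place returns True when enough resources are available.
    """
    if not can_place(instance):
        # Not enough resources: keep the failed instance and try again later,
        # instead of entering a delete/create loop.
        return False
    delete(instance)
    create(instance)
    return True
```

For example, calling `recover_instance("svc-1", lambda i: False, deleted.append, created.append)` returns `False` and leaves both lists empty, so nothing is deleted when placement would fail.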

Host Agent Upgrade

To fully benefit from these changes, we recommend upgrading your host agents to the most recent version available. To upgrade, SSH to your host and run the same curl command that was initially used to install the agent. For example:

sudo curl -sSL http://www.nirmata.io/nirmata-host-agent/setup-nirmata-agent.sh | sudo sh -s aws

See: http://docs.nirmata.io/en/latest/CloudProviders.html#host-setup

Next steps…

At Nirmata, our vision is to provide the best-in-class Enterprise DevOps solution to manage application containers across public and private clouds. We are working hard towards our GA, and your feedback and support help! Please continue to let us know what features you would like, and how we can further improve Nirmata.

Damien Toledo
Co-founder, VP of Engineering at Nirmata

Topics: cloud applications, Containers, Cloud native, Continuous Delivery, resiliency, Product, microservices, DevOps, Orchestration, Cloud Architecture
