Smart from the ground up

To complete their vision, the team had to build an application management tool – called App360. The end result is a more stable data center ecosystem, which operates more efficiently, requiring less energy. It also demands less management effort, as it handles requests and incidents automatically instead of manually, and can even fix problems before they happen.

The project was all about the results, not technology for its own sake, ICICI managing partner Imran Shaik told DCD in ICICI’s Award submission: “Being one of the topmost banks in India, it is imperative to ensure that the availability and performance are given the topmost priority since technology is the backbone of all banking services.”

The bank has two data centers: a primary facility and a disaster recovery data center.

Its IT hardware is virtualized using VMware and delivered as a service – apart from five percent of the x86 systems, which are not virtualized because of requirements for compliance or performance.

ICICI has unified performance management, linked to business availability, for the infrastructure and the application workloads which run on it. Across the bank, services are backed by a service level agreement (SLA), with tiered priority similar to the platinum, gold or silver service tiers offered by commercial data centers. The cost of usage is apportioned to each group within the company and recovered with a chargeback mechanism.
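The article does not describe how ICICI calculates its chargeback, so the following is only a minimal sketch of one common approach: splitting a shared cost across groups in proportion to their measured usage. The function name, group names and figures are all hypothetical.

```python
def apportion_cost(total_cost, usage_by_group):
    """Split total_cost across groups in proportion to their usage.

    Proportional splitting is an assumption made for illustration;
    the bank's actual chargeback rules are not given in the article.
    """
    total_usage = sum(usage_by_group.values())
    if total_usage <= 0:
        raise ValueError("total usage must be positive")
    return {
        group: total_cost * usage / total_usage
        for group, usage in usage_by_group.items()
    }


# Hypothetical groups and usage units (for example, VM-hours).
charges = apportion_cost(1000.0, {"retail": 60, "treasury": 30, "cards": 10})
print(charges)
```

A real scheme would probably also weight usage by service tier, but the article gives no detail on that.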

To enable all this, numerous tools have been implemented. IoT sensors were installed for environmental monitoring and capacity planning, managed by Vertiv’s Trellis data center infrastructure management (DCIM). The Trellis tool aggregates data from temperature and humidity sensors to create real-time thermal mapping and visualization at the rack level, so the heat load can be monitored and space usage optimized.
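As a rough illustration of rack-level thermal aggregation, the sketch below averages temperature readings per rack and flags racks whose average exceeds a limit. It is not Trellis code: the data layout, the function names and the 27 °C threshold are assumptions chosen for the example.

```python
from collections import defaultdict
from statistics import mean

# Illustrative inlet-temperature limit in degrees Celsius (an assumption).
INLET_LIMIT_C = 27.0


def rack_averages(readings):
    """Average temperature per rack from (rack_id, temp_c) pairs."""
    by_rack = defaultdict(list)
    for rack_id, temp_c in readings:
        by_rack[rack_id].append(temp_c)
    return {rack_id: mean(temps) for rack_id, temps in by_rack.items()}


def find_hotspots(readings, limit_c=INLET_LIMIT_C):
    """Return the sorted IDs of racks whose average exceeds limit_c."""
    averages = rack_averages(readings)
    return sorted(rack for rack, avg in averages.items() if avg > limit_c)


# Hypothetical sensor readings: two sensors on each of two racks.
sample = [("R1", 24.0), ("R1", 26.0), ("R2", 29.0), ("R2", 31.0)]
print(find_hotspots(sample))
```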

Trellis helps explore the data center’s total consumption, energy costs, and PUE. Combined with intelligent PDUs, this helps ICICI to manage power and efficiency proactively. The DCIM operates as a closed-loop system, generating SNMP traps and various alerts for upstream systems such as the centralized building management system.
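PUE (power usage effectiveness) is the ratio of the total energy drawn by the facility to the energy delivered to the IT equipment, so a value closer to 1.0 means less overhead for cooling and power distribution. A minimal sketch of the calculation follows; the figures are hypothetical and are not ICICI's.

```python
def pue(total_facility_kwh, it_equipment_kwh):
    """Power usage effectiveness: total facility energy / IT energy."""
    if it_equipment_kwh <= 0:
        raise ValueError("IT equipment energy must be positive")
    return total_facility_kwh / it_equipment_kwh


# Hypothetical example: 1,500 kWh drawn by the site, 1,000 kWh by IT load.
print(pue(1500.0, 1000.0))  # 1.5
```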

App360 provides a complete mapping of all applications, virtual machines, physical servers, storage devices, backup systems and networks. It has a built-in alerting mechanism for events like SSL certificate expiry, produces reports, and tracks server activity and incidents. It also has automatic scripts and sends reminder emails: “No such consolidated tool existed in the market place,” said Shaik.

The App360 tool ensures any unforeseen eventualities can be managed efficiently, avoiding the business impact of downtime. In an incident, the support team can check App360 for infrastructure details: when a base server goes down, it provides information about the applications which are hosted on it, so the right teams can be notified.
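App360 is an in-house tool and its internals are not public, so the sketch below only illustrates the idea of the lookup described above: given a failed server, find the applications hosted on it and the teams that own them. The mappings, names and function are hypothetical.

```python
# Hypothetical inventory: which applications run on which base server.
HOSTED_APPS = {
    "base-server-01": ["net-banking", "payments"],
    "base-server-02": ["reporting"],
}

# Hypothetical ownership: which team supports each application.
APP_OWNERS = {
    "net-banking": "digital-channels",
    "payments": "payments-ops",
    "reporting": "data-platform",
}


def teams_to_notify(server, hosted_apps=HOSTED_APPS, app_owners=APP_OWNERS):
    """Return the sorted, de-duplicated teams affected when server fails."""
    apps = hosted_apps.get(server, [])
    return sorted({app_owners[app] for app in apps if app in app_owners})


print(teams_to_notify("base-server-01"))
```

Keeping the server-to-application and application-to-team mappings separate means each can be updated independently as workloads move between hosts.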

The bank has an incident management process with a centralized IT command center, which uses multiple tools and a customized Service Manager tool from HP to give management control over incidents and their follow-ups. The tools include Oracle Enterprise Manager, HP Operations Manager, Windows SCOM, NetApp OCI, Appnomics, AppDynamics, Dynatrace, HPOVM, and OpsCentre – all cascaded with the in-house App360 tool.

Rack-level power is handled by Sentry Power Manager (SPM), which enables socket-level monitoring. Predictive analysis enables power management and capacity planning. The bank deployed modular power distribution units (PDUs), which helped manage the dynamic load and provided a further saving on capex.

All this data is held in a Hadoop data lake, including structured, unstructured and semi-structured data, with data discovery, optimization and analysis. It can be accessed quickly and searched by multiple factors, including the IP address of servers, appliances, load balancers, switches and storage, all from a single menu. This helps staff to generate reports quickly and respond to contingencies.

“Hotspots in any data center are always the biggest concern,” said Shaik. “Hotspots identified through DCIM are addressed through optimizing IT equipment placement, realignment of the raised floor tiles, adding active tiles, adding baffling/blanking panels, deploying rack mount fan trays for hot air exhaust and fine tuning the cold aisle containment.”

This leaves the facilities operating at optimal environmental conditions that meet ASHRAE standards. The precision air conditioning (PAC) units now use 48 percent less power, and the chiller power consumption has been cut by 13 percent – generating a direct benefit to operating expenditure (opex).