Data Center Pulse Blogs


Data Center Operators: Be a Twit, Be Heard

As a whole, we data center types don't seem to be a very talkative bunch. I'm not sure why that is, and I can't say for sure that we're more or less talkative than other professions. However, anecdotal evidence from the Data Center Pulse LinkedIn Group and Twitter seems to support my assertion. We spend millions every year on conferences where we're supposed to hear from smart people and mingle (network) with other members of the data center community, yet as a group we do very little with social media, where the effort is almost free.

Why Data Center Pulse was founded

Data Center Pulse was founded to support dialog, discussion and debate on the important issues facing the data center community. The DCP community has now reached well over 2,500 members from more than 60 different countries and every conceivable industry vertical. There are literally hundreds of members in every job classification associated with data center ownership, and yet on LinkedIn we have just a handful of regular discussion participants.

What can Twitter do for YOU?

There's also the issue of using (or not using) Twitter. I realize that not everyone has a Twitter account, but I can tell you that I've found it to be an invaluable tool. When I first signed up, I wondered what I could possibly use this new medium for. Other than the occasional pithy comment or pointer to one of my own blogs, what could this tool really do for me?

Twitter has opened doors for me that I would argue would otherwise never have opened. Besides making some great friends from across the country and the world, I've also participated in debates, arguments, conferences, and friendly banter using Twitter. The bad part about Twitter is that sometimes you debate with someone and you are "forced" to learn something new (the horror). Even worse, sometimes that "something new" challenges your assumptions (even more horrible). However, if you've got the courage, and can stomach the idea that someone else might very rarely know more about something than you do, then Twitter is a great place to find that out.

Twitter Communities

There is a vast community of Twitterers out there, and many of them will be Tweeting about things that you have little or no interest in.  That being said, at DCP we like to treat the Data Center and IT as a system (the Stack). If you're managing and leading your function as part of a larger system, then knowing more about the potential outside influences on your system can be very useful. In other words, while I'm suggesting you find some great data center folks to follow (a few DCP Twits listed below), don't limit yourself. I recommend finding folks in Cloud, IT Infrastructure, Bigdata, BYOD, Business strategy, etc., etc. Over time you can build yourself a community that will help keep you honest about your assumptions. You will also likely create new opportunities for personal growth and improve your contribution to your organization.

Give the Gift

If you're already on Twitter, great. If you're not, then sign up. If you have a friend who isn't on Twitter, recommend that they sign up. I look forward to getting into a discussion about where to find the best falafel, and maybe even learning a thing or two through open discussion and debate with you.

Twits to follow from Data Center Pulse group leadership:

@tcrawford (Tim Crawford)
@rhdonalson (Richard Donaldson)
@jmwiersma (Jan Wiersma)
@grmhay (Graeme Hay)
@dudenelson (Dean Nelson)
@mthiele10 (Mark Thiele)

I'm happy to make additional Twits-to-follow recommendations if you're interested. 

 

Where is the rack density trend going?

When debating capacity management in the datacenter, the number of watts consumed per rack is always a hot topic.

Years ago we could get away with building datacenters that supported 1 or 2 kW per rack in cooling and energy supply. Over the last few years, demand for racks of around 5-7 kW seems to have become the standard. Five years ago I witnessed the introduction of blade servers first hand. This generated much debate in the datacenter industry, with some claiming we would all have 20+ kW racks within 5 years. This never happened… well, at least not on a massive scale…

So what is the trend in energy consumption on a rack basis?

Readers of my Dutch datacenter blog know I have been watching and trending energy development in the server and storage industry for a long time. To update my trend analysis I wanted to start with a consumption trend for the last 10 years. I could use the hardware specs found on the internet for servers, but most published energy consumption values are ‘name-plate ratings’. Green Grid’s whitepaper #23 correctly states:

Regardless of how they are chosen, nameplate values are generally accepted as representing power consumption levels that exceed actual power consumption of equipment under normal usage. Therefore, these over-inflated values do not support realistic power prediction

I have known HP’s ProLiant portfolio for a long time and have successfully used their HP Power Calculator tools (now: HP Power Advisor). They display the nameplate values as well as the power used at different utilizations, and I know from experience that these values are pretty accurate. So that seems as good a starting point as any…

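I decided to go for 3 form factors: the 1U rackmount server, the blade server and the Density Optimized server. The last of these has been described as: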

..minimalist server designs that resemble blades in that they have skinny form factors but they take out all the extra stuff that hyperscale Web companies like Google and Amazon don’t want in their infrastructure machines because they have resiliency and scale built into their software stack and have redundant hardware and data throughout their clusters….These density-optimized machines usually put four server nodes in a 2U rack chassis or sometimes up to a dozen nodes in a 4U chassis and have processors, memory, a few disks, and some network ports and nothing else per node.[They may include low-power microprocessors]

For the 1U server I selected the HP DL360, a well-known mainstream ‘pizzabox’ server. For the blade servers I selected the HP BL20p (p-class) and HP BL460c (c-class). The Density Optimized server could only be the recently introduced (5U) HP Moonshot.

For the server configurations I used the following guidelines:

  • Single power supply (no redundancy) and platinum rated when available.
  • No additional NICs or other modules.
  • Always selecting the power optimized CPU and memory options when available.
  • Always selecting the smallest disk. SSD when available.
  • Blade server enclosures:
    • Pass-through devices, no active SAN/LAN switches in the enclosures
    • No redundancy and onboard management devices.
    • C7000 for c-class servers
    • Converted the blade chassis power consumption, fully loaded with the calculated server, back to power per 1U.
  • Used the ‘released’ date of the server type found in the Quickspec documentation.
  • Collected data of server utilization at 100%, 80%, 50%. All converted to the usage at 1U for trend analysis.
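The conversion back to power per 1U mentioned in the guidelines can be sketched as a simple division of the enclosure's total draw by its height in rack units. This is only a minimal sketch: the function name, the 10U enclosure height and the 3,544 W input are illustrative assumptions, not values taken from the HP tools.

```python
def watts_per_u(chassis_watts: float, chassis_height_u: int) -> float:
    """Spread the power draw of a fully loaded enclosure evenly over its rack units."""
    if chassis_height_u <= 0:
        raise ValueError("chassis height must be a positive number of rack units")
    return chassis_watts / chassis_height_u


# Hypothetical input: an enclosure that is 10U high and draws 3,544 W when
# fully loaded with identical blades (both numbers are assumptions).
example_per_u = watts_per_u(3544.0, 10)
print(f"{example_per_u:.1f} W per 1U")
```

Normalizing to 1U this way is what allows blade enclosures, 1U servers and multi-U density optimized chassis to be put on the same trend line.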

This resulted in the following table:

Server type       Year  CPU core count  CPU type               RAM (GB)  HD (GB)     100% Util (Watt for 1U)  80% Util (Watt for 1U)  50% Util (Watt for 1U)
HP BL20p          2002  1               2x Intel PIII          4         2x 36       328.00                   -                       -
HP DL360          2003  1               2x Intel PII           4         2x 18       176.00                   -                       -
HP DL360G3        2004  1               2x Intel Xeon 2.8 GHz  8         2x 36       360.00                   -                       -
HP BL20pG4        2006  1               2x Intel Xeon 5110     8         2x 36       400.00                   -                       -
HP BL460c G1      2006  4               2x Intel L5320         8         2x 36       397.60                   368.80                  325.90
HP DL360G5        2008  2               2x Intel L5240         8         2x 36       238.00                   226.00                  208.00
HP BL460c G5      2009  4               2x Intel L5430         8         2x 36       368.40                   334.40                  283.80
HP DL360G7        2011  4               2x Intel L5630         8         2x 60 SSD   157.00                   145.00                  128.00
HP BL460c G7      2011  6               2x Intel L5640         8         2x 120 SSD  354.40                   323.90                  278.40
HP BL460c Gen8    2012  6               2x Intel 2630L         8         2x 100 SSD  271.20                   239.10                  190.60
HP DL360e Gen8*   2013  6               2x Intel 2430L         8         2x 100 SSD  170.00                   146.00                  113.00
HP DL360p Gen8*   2013  6               2x Intel 2630L         8         2x 100 SSD  252.00                   212.00                  153.00
HP Moonshot       2013  2               Intel Atom S1260       8         1x 500      177.20                   172.40                  165.20

* HP split the DL360 into a stripped down version (the ‘e’) and an extended version (the ‘p’)

And a nice graph:

The graph shows an interesting peak around 2004-2006. After that the power consumption declined. This is mostly due to power optimized CPU and memory modules. The introduction of Solid State Disks (SSD) is also a big contributor.

Obviously people will argue that:

  • the performance for most systems is task specific
  • and blades provide higher density (more CPU cores) per rack,
  • and some systems provide more performance and maybe more performance/Watt,
  • etc…

Well, datacenter facility guys couldn’t care less about those arguments. For them it’s about the power per 1U or the power per rack and its trend.

With a depreciation time of 10-15 years on facility equipment, the datacenter needs to support many IT refresh cycles. IT guys getting faster CPUs, more memory and bigger disks is really nice, and it’s even better if the performance/watt ratio is great… but if the overall rack density goes up, then facilities needs to support it.

To provide more perspective on the CPU density per rack, I plotted the number of CPU cores in a rack filled to 40U against the total power at 40U:

Still impressive numbers: between 240 and 720 CPU cores in 40U of modern day equipment.
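For the power side of that plot, the per-1U values can be scaled up to a 40U filled rack. The sketch below is only an illustration of that arithmetic: it takes three of the 100% utilization values from the table above and multiplies them by 40; the function and variable names are my own assumptions.

```python
RACK_UNITS = 40  # filled rack height used for the comparison


def rack_kw(watts_per_1u: float, units: int = RACK_UNITS) -> float:
    """Total rack power in kW when every rack unit draws watts_per_1u."""
    return watts_per_1u * units / 1000.0


# 100% utilization values (Watt for 1U) taken from the table above.
per_u_at_full_load = {
    "HP DL360e Gen8": 170.00,
    "HP BL460c Gen8": 271.20,
    "HP Moonshot": 177.20,
}

for server, watts in per_u_at_full_load.items():
    print(f"{server}: {rack_kw(watts):.2f} kW per {RACK_UNITS}U")
```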

Next I wanted to test my hypothesis, so I looked at an active 10,000+ server deployment consisting of 1-10 year old servers from Dell/IBM/HP/SuperMicro. I ranked them in age groups from 2003 to 2013 and sorted them by form factor: 1U Rackmount, Blades and Density Optimized. I selected systems with roughly the same hardware config (2 CPU, 2 HD, 8GB RAM). For most age groups the actual power consumption (@ 100%, 80% and 50%) seemed off by 10%-15%, but the trend remained the same, especially among form factors.

It also confirmed that after the drop, due to energy optimized components and SSD, the power consumption per U is now rising slightly again.

Density in general seemed to rise with lots more CPU cores per rack, but at a higher power consumption cost on a per rack basis.

Let’s take out the crystal ball

The price of compute & storage continues to drop, especially if you look at Amazon and Google.

Google and Microsoft have consistently been dropping prices over the past several months. In November, Google dropped storage prices by 20 percent.

For AWS, the price drops are consistent with its strategy. AWS believes it can use its scale, purchasing power and deeper efficiencies in the management of its infrastructure to continue dropping prices. [Techcrunch]

If you follow Jevons Paradox then this will lead to more compute and storage consumption.

All this compute and storage capacity still needs to be provisioned in datacenters around the world. The last time IT experienced growing pains at the intersection between IT & Facility, it accelerated the development of blade servers to optimize the physical space used (that was a bad cure for some… but that’s beside the point now). The current rapid growth has accelerated the development of Density Optimized servers that strike a better balance between performance, physical space and energy usage. All major vendors and projects like Open Compute are working on this, and the segment showed 66.4% year-over-year revenue growth in 4Q12.

Blades also continue to gain market share and now account for 16.3% of total server revenue:

"Both types of modular form factors outperformed the overall server market, indicating customers are increasingly favoring specialization in their server designs" said Jed Scaramella, IDC research manager, Enterprise Servers "Density Optimized servers were positively impacted by the growth of service providers in the market. In addition to HPC, Cloud and IT service providers favor the highly efficient and scalable design of Density Optimized servers. Blade servers are being leveraged in enterprises' virtualized and private cloud environments. IDC is observing an increased interest from the market for converged systems, which use blades as the building block. Enterprise IT organizations are viewing converged systems as a method to simplify management and increase their time to value." [IDC]

With cloud providers going for Density Optimized and enterprise IT for blade servers, the market is clearly moving to optimizing rack space. We will see a steady rise in demand for kW/rack with Density Optimized already at 8-10kW/rack and blades 12-16kW/rack (@ 46U).

There will still be room in the market for the ‘normal’ rackmount server like the 1U, but the 2012 and 2013 models already show signs of a rise in watt/U for those systems also.

For the datacenter owner this will mean either supplying more cooling & power to meet demand or leaving racks (half) empty, if you haven’t built for these consumption values already.

In the long run we will follow the Gartner curve from 2007:

The market is currently in the ‘drop’ phase (a little behind the prediction…) and moving towards the ‘increase’ phase.

More:

Density Optimized servers (aka microservers) market is booming

IDC starts tracking hyperscale server market

Documentation and disclaimer on the HP Power Advisor

Google's BMS got hacked. Is your datacenter BMS next?

A recent USA Congressional survey stated that power companies are targeted by cyber attacks 10,000 times per month.

After the 2010 discovery of the Stuxnet virus the North American Electric Reliability Corporation (NERC) established both mandatory standards and voluntary measures to protect against such cyber attacks, but most utility providers haven't implemented NERC's voluntary recommendations.

Stuxnet hit the (IT) newspaper front-pages around September 2010, when Symantec announced the discovery. It represented one of the most advanced and sophisticated viruses ever found. One that targeted specific PLC devices in nuclear facilities in Iran:

Stuxnet is a threat that was primarily written to target an industrial control system or set of similar systems. Industrial control systems are used in gas pipelines and power plants. Its final goal is to reprogram industrial control systems (ICS) by modifying code on programmable logic controllers (PLCs) to make them work in a manner the attacker intended and to hide those changes from the operator of the equipment.

DatacenterKnowledge picked up on it in 2011, asking ‘is your datacenter ready for Stuxnet?’

After this article the datacenter industry didn’t seem to worry much about the subject. Most of us deemed the chance of being hacked with a highly sophisticated virus, attacking our specific PLCs or facility controls, very low.

Recently security company Cylance published the results of a successful hack attempt on a BMS system located at a Google office building. This successful hack attempt shows a far greater threat for our datacenter control systems.

The road towards TCP/IP

Over the last few years the world of BMS & SCADA systems has radically changed. The old (legacy) systems consisted of vendor specific protocols, specific hardware and separate networks. Modern day SCADA networks consist of normal PCs and servers that communicate through IT standard protocols like IP, and share networks with normal IT services.

IT standards have also invaded facility equipment: the modern day UPS and CRAC are by default equipped with an onboard webserver capable of sending warnings using another IT standard: SNMP.

The move towards IT standards and TCP/IP networks has provided us with many advantages:

  • Convenience: you are now able to manage your facility systems with your iPad or just a web browser. You can even enable remote access using Internet for your maintenance provider. Just connect the system to your Internet service provider, network or Wi-Fi and you are all set. You don’t even need to have the IT guys involved…
  • Optimize: you are now able to do cross-system data collection so you can monitor and optimize your systems. Preferably in an integrated way so you can have a birds-eye view of the status of your complete datacenter and automate the interaction between systems.

Many of us end-users have pushed the facility equipment vendors towards this IT enabled world and this has blurred the boundary between IT networks and BMS/SCADA networks.

In the past the complexity of protocols like BACnet and Modbus, which tie everything together, scared most hackers away. We all relied on ‘security through obscurity’, but modern SCADA networks no longer provide this (false) sense of security.

Moving towards modern SCADA.

The transition towards modern SCADA networks and systems is approached in many different ways. Some vendors implemented embedded Linux systems on facility equipment. Others consolidated and connected legacy systems & networks on standard Windows or Linux servers acting as gateways.

This transition has not been easy for most BMS and SCADA vendors. A quick round among my datacenter peers provides the following stories:

  • BMS vendors installing old OS versions (Windows/Linux) because the BMS application doesn’t support the updated ones.
  • BMS vendors advising against OS updates (security, bug fix or end-of-support) because it will break their BMS application.
  • BMS vendors unable to provide details on what ports to enable on firewalls; ‘just open all ports and it will work’.
  • Facility equipment vendors without software update policies.
  • Facility equipment vendors without bug fix deployment mechanisms; having to update dozens of facility systems manually.

And these stories all apply to modern day, currently used, BMS&SCADA systems.

Vulnerability patching.

Older versions of the SNMP protocol have several known vulnerabilities that affected almost every platform with an SNMP implementation, including Windows, Linux, Unix and VMS.

It’s not uncommon to find these old SNMP implementations still operational in facility equipment. With the lack of software update policies, that also include the underlying (embedded) OS, new security vulnerabilities will also be neglected by most vendors.

The OS implementation from most BMS vendors also isn’t hardened against cyber attacks. Default ports are left open, default accounts are still enabled.

This is all great news for most hackers. It’s much easier for them to attack a standard OS like a Windows or Linux server. There are lots of tools available to make the life of the hacker easier and he doesn’t have to learn complex protocols like Modbus or BACnet. This is by far the best attack surface in modern day facility system environments.

The introduction of DCIM software will move us even more from the legacy SCADA towards an integrated & IT enabled datacenter facility world. You will definitely want to have your ‘birds-eye DCIM view’ of your datacenter anywhere you go, so it will need to be accessible and connected. All DCIM solutions run on mainstream OSs, and most of them come with IT industry standard databases. Those configurations provide another excellent attack surface, if not managed properly.

ISO 27001

Some might say: ‘I’m fully covered because I got an ISO 27001 certificate’.

The scope of an ISO 27001 audit and certificate is set by the organization pursuing the certification. For most datacenter facilities the scope is limited to the physical security (like access control, CCTV) and its processes and procedures. IT systems and IT security measures are excluded because those are part of the IT domain and not facilities. So don’t assume that BMS and SCADA systems are included in most ISO 27001 certified datacenter installations.

Natural evolution

Most of the security and management issues are a normal part of the transition into a larger scale, connected IT world for facility systems.

The same lack of awareness of security, patching, managing and hardening of systems was seen in the IT industry 10-15 years ago. The move from a central mainframe world to decentralized servers and networks, combined with the introduction of the Internet, forced IT administrators to focus on managing the security of their systems.

In the past I have heard Facility departments complain that IT guys should involve them more because IT didn’t understand power and cooling. With the introduction of a more software enabled datacenter the Facility guys now need to do the same and get IT more involved; they have dealt with all of this before…

Examples of what to do:

  • Separate your systems and divide the network. Your facility system should not share its network with other (office) IT services. The separate networks can be connected using firewalls or other gateways to enable information exchange.
  • Assess your real needs: not everything needs to be connected to the Internet. If facility systems can’t be hardened by the vendor or your own IT department, then don’t connect them to the Internet. Use firewalls and Intrusion Detection Systems (IDS) to secure your system if you do connect them to the Internet.
  • Involve your IT security staff. Have facilities and IT work together on implementing and maintaining your BMS/SCADA/DCIM systems.
  • Create awareness by urging your facility equipment vendor or DCIM vendor to provide a software update & security policy.
  • Include the facility-systems in the ISO 27001 scope for policies and certification.
  • Make arrangements with your BMS and/or DCIM vendor about management of the underlying OS. Preferably this is handled by your internal IT guys, who should already know everything about patching and hardening IT systems. If the vendor provides you with an appliance, then the vendor needs to manage the patching process and hardening of the system.
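As a small illustration of the first recommendation (keeping facility systems off the office network), an inventory of facility device addresses can be checked against the office subnets. This is a minimal sketch using Python's standard ipaddress module; the device names, addresses and subnet are made-up examples, and a real check would read them from your own inventory.

```python
import ipaddress


def devices_on_office_networks(device_addresses, office_networks):
    """Return facility devices whose IP address falls inside an office subnet.

    device_addresses: mapping of device name -> IP address string
    office_networks:  iterable of subnets in CIDR notation
    """
    networks = [ipaddress.ip_network(net) for net in office_networks]
    exposed = {}
    for name, address in device_addresses.items():
        ip = ipaddress.ip_address(address)
        hits = [str(net) for net in networks if ip in net]
        if hits:
            exposed[name] = hits
    return exposed


# Hypothetical inventory: names, addresses and subnet are illustrative only.
office = ["10.10.0.0/16"]
facility_devices = {"ups-a": "10.20.1.15", "crac-3": "10.10.4.22"}

flagged = devices_on_office_networks(facility_devices, office)
print(flagged)
```

Any device that shows up in the result shares address space with office IT and is a candidate for moving behind a firewall or gateway.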

If you would like to talk about the future of securing datacenter BMS/SCADA/DCIM systems, then join me at Observe Hack Make (OHM) 2013. OHM is a five-day outdoor international camping festival for hackers and makers, and those with an inquisitive mind. It starts July 31st 2013.

Note: There are really good whitepapers on IDS systems (and firewalls) for securing Modbus and BACnet protocols, if you do need to connect those networks to the internet. Example: Snort IDS for SCADA (pdf) or books about SCADA & security at Amazon.

Source: A large part of this blog is based on a Dutch article on BMS/SCADA security from January 2012 by Jan Wiersma & Jeroen Aijtink (CISSP). The Dutch IT Security Association (PViB) nominated this article for ‘best security article of 2012’.

DCIM - It’s not about the tool; it’s about the implementation

 


So you just finished your extensive purchase process and now the DCIM DVD is on your desk.

Guess what; the real work just started…

The DCIM solution you bought is just a tool; implementing it will require change in your organization. Some of the change will be small, for example no longer having to manually put data in an Excel file but having it automated in the DCIM tool. Some of the change will be bigger, like defining and implementing new processes and procedures in the organization. A good implementation will impact the way everyone works in your datacenter organization. The positive outcome of that impact is largely determined by the way you handle the implementation phase.

These are some of the most important parts you should consider during the implementation period:

Implementation team.

The implementation team should consist of at least:

  • A project leader from the DCIM vendor (or partner).
  • An internal project leader.
  • DCIM experts from the DCIM vendor.
  • Internal senior users.

(Some can be combined roles)

Some of the DCIM vendors will offer to guide the implementation process themselves; others use third party partners.

During your purchase process it’s important to have the DCIM vendor explain in detail how they will handle the full implementation process. Who will be responsible for what part? What do they expect from your organization? How much time do they expect from your team? Do they have any reference projects (of the same size & complexity)?

The DCIM vendor (or its implementation partner) will make implementation decisions during the project that will influence the way you work. These decisions will give you either great ease of working with the tool or day-to-day frustration. It’s important that they understand your business and way of working. Not having any datacenter experience at all will not benefit your implementation process, so make sure they supply you with people that know datacenter basics and understand your (technical) language.

The internal senior users should be people that understand the basic datacenter parts (from a technical perspective) and really know the current internal processes. Ideal candidates are senior technicians, your quality manager, backoffice sales people (if you’re a co-lo) and site/operations managers.

The internal senior users also play an important role in the internal change process. They should be enthusiastic early adopters who really want to lead the change and believe in the solution.

Training.

After you have kicked off your implementation team, you should schedule training for your senior users and early adopters first. Have them trained by the DCIM vendor. This can be done on dummy (fictitious) data. This way your senior users can start thinking about the way the DCIM software should be used within your own organization. Include some Q&A and ‘play’ time at the end of the training. Having a sandbox installation of the DCIM software available for your senior users after the training also helps them to get more familiar with the tool and test some of their process ideas.

After you have loaded your actual data and made your process decisions surrounding the DCIM tool, you can start training all your users.

Some of the DCIM vendors will tell you that training is not needed because their tool is so very user friendly. The software may be user friendly, but your users will still need to be trained on the specific usage of the tool within your own organization.

Have the DCIM vendor trainer team up with your senior users in the actual training. This way you can make the training specific for your implementation and have the senior users at hand to answer any organization specific questions.

The training of general users is an important part of the change and process implementation in your organization.

Take any feedback during the general training seriously. Provide the users with a sandbox installation of the software so they can try things without breaking your production installation and data. This will give you broad support for the new way of working.

Data import and migration.

Based on the advice in the first article, you will already have identified your current data sources.

During the implementation process the current data will need to be imported into the DCIM data structure or integrated.

Before you import you will need to assess your data; are all the Excel sheets, Visio and AutoCAD drawings accurate? Garbage in means garbage out in the DCIM tool.

Intelligent import procedures can help to clean your current data; connecting data sources and cross referencing them will show you the mismatches. For example: adding (DCIM) intelligence to importing multiple Excel sheets with fiber cables and then generating a report with fiber ports that have more than 1 cable connected to them (which would be impossible in real life).
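To make the fiber example a bit more concrete, the cross-reference boils down to grouping cables by the port they claim to terminate on and reporting every port that appears more than once. The sketch below is a minimal illustration in Python, not how any particular DCIM tool does it; the row layout (cable id, port at end A, port at end B) and the sample identifiers are assumptions.

```python
from collections import defaultdict


def ports_with_multiple_cables(cables):
    """Report ports that more than one cable claims to be connected to.

    cables: iterable of (cable_id, end_a_port, end_b_port) rows, for example
            merged from several imported spreadsheets.
    """
    cables_by_port = defaultdict(set)
    for cable_id, end_a, end_b in cables:
        cables_by_port[end_a].add(cable_id)
        cables_by_port[end_b].add(cable_id)
    return {
        port: sorted(ids)
        for port, ids in cables_by_port.items()
        if len(ids) > 1
    }


# Hypothetical rows as they might come out of two different Excel sheets.
rows = [
    ("F-001", "PP01:1", "PP07:1"),
    ("F-002", "PP01:2", "PP07:2"),
    ("F-003", "PP01:1", "PP09:4"),  # claims the same port as F-001
]

conflicts = ports_with_multiple_cables(rows)
print(conflicts)
```

The ports in the report are the ones to walk out to and verify before the data goes into production.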

Your DCIM vendor or its partner should be able to facilitate the import. Make sure you cover this in the procurement negotiations; what kind of data formats can they import? Should you supply the data in a specific format?

This also brings us back to the basic datacenter knowledge of the DCIM vendor/partner. I have seen people import Excel lists of fiber cable and connect them to PDUs… The DCIM vendor/partner should provide you a worry free import experience.

Create phases in the data import and have your (already trained) senior users perform acceptance tests. They know your datacenter layout and can check for inconsistencies.

Prepare to be flexible during the import; not everything can be modeled the way you want it in the software.

For example, when I bought my first DCIM tool in 2006 they couldn’t model blade servers yet and we needed a workaround for it. Make sure the workarounds are known and supported by the DCIM vendor; you don’t want to create all your workaround assets again when the software update that supports the correct models finally arrives. The DCIM vendor should be able to migrate this for you.

Integration.

The first article drilled down into the importance of integration. Make sure your DCIM vendor can accommodate your integration wishes.

Integration can be very complex and can mess up your data (or worse) if not done correctly. Test extensively, on non-production data, before you go live with integration connections.

The integration part of the implementation process is very suitable for a phased approach. You don’t need all the integrations on day one of the implementation.

Involve IT Information architects if you have them within your company and make sure external vendors of the affected systems are connected to the project.

Roadmap and influence.

Ask for a software development roadmap and make sure your wishes are included before you buy. The roadmap should show you when new features will be available in the next major release of your DCIM tool.

The DCIM vendor should also provide you with a release cycle displaying the scheduled minor releases with bug fixes. When you receive a new release it should include release-notes mentioning the specific bugs that are fixed and the new features included in that new release. Ask the DCIM vendor for an example of the roadmap and release-notes.

During the purchase process you may have certain feature requests that the vendor is not able to fulfill yet. Especially new technology, like the blade server example I used earlier, will take some time to appear in the DCIM software release. This is not a big problem as long as the DCIM vendor is able to model it within reasonable time.

One way to handle missing features is to make sure it’s on the software development roadmap and make the delivery schedule part of your purchase agreement.

After you signed the purchase order your influence on the roadmap will become smaller. They will tell you it doesn’t… but it does… Urge your DCIM vendor to start a user-group. This group should be fully facilitated by the vendor and provide valuable input for the roadmap and the future of the DCIM tool. A strong user-group can be of great value to the DCIM vendor AND its customers.

Got any questions on real world DCIM? Please post them on the Datacenter Pulse network: http://www.linkedin.com/groups?gid=841187 (members only)

My Data Center Drives Faster Than Yours

I jumped on my data center the other day and rode into town for a beer at the local saloon. I tell you, these data centers keep getting bigger and faster every year. Did I confuse you? What does a data center mean to you? Is it some converged infrastructure with virtualization on it or is it a container with some racks of computers?


Why am I Bothered?


I probably should just ignore the vendor driven Data Center references, since it did little good when many of us discussed, debated, and argued on Twitter about where and how the term "cloud" should be used. Unfortunately, I can't ignore it, as I think there are real and important distinctions to be made about what a data center is.
As many of the blogs and marketing splashes suggest, the "data center" is changing, but not because it's converged infrastructure or big data or SD(XYZ). The data center is changing because there are business, technical and sustainability drivers that require it to change. I'm bothered about the liberal use of the term "data center" because I fear it dilutes the focus on the larger picture of your environment. When thinking about your infrastructure and how it needs to work to support your enterprise, you must consider the entire envelope, not just some hardware or software. I've commented on this several times as it relates to the actual data center by using the Data Center Pulse "Data Center Stack". After looking at the Stack you are more likely to appreciate the importance and opportunity of a more holistic approach. The Stack can be immensely valuable, even if it's just used to help with communication and guide your partners.


References to "Data Center"


I won't call out all the offending vendors, but suffice it to say, many of the top names in the industry (Software & Hardware providers) have referenced their "data center" and how it will change organizations, strategy, make you coffee, and with an upgrade get you to Mars. What each of these companies failed to comment on is how their "data center" will affect the actual data center.
It would be great to see some of these providers step up and comment on how modern infrastructure concepts (not data centers) will affect data center design and/or put your current data center at risk. I realize doing this would likely extend the sales cycle and scare buyers, so I'm assuming it won't happen. In the absence of vendor action, I felt it necessary to highlight a few considerations that IT buyers should keep in mind.
Impact of Modern Tech on Data Center Requirements
There are myriad ways in which you can affect or impact the performance of your data center, but in this case I'm only going to cover a few of the top considerations associated with IT infrastructure.

Power density:
o 85% or more of existing data centers can't support more than about 6 kW per cabinet or roughly 240 Watts per square foot. With the density of most modern server gear, you can easily get to over 35 kW in a cabinet which equates to ~1400 Watts per square foot. This type of power requirement isn't something that can be easily retrofitted to an existing facility. If you install without retrofitting you will be forced to use an enormous amount of additional floor space to allow for adequate cold air. Or you can rip and replace with in row cooling options. However, most of these changes are only stopgap at best.

Network Diversity and capacity
o Few existing data centers have more than 1-3 network (WAN) providers on premise. With greater use of cloud, adoption of agile business/IT, and a more broadly distributed customer and partner base you must have a wider selection of providers. With so few providers you won't have price contestability and likely won't always get optimal solution and route options.

Weight of racks being installed
o We still say "raised floor" when we're talking about usable data center space. Sad, but true, roughly 85% of existing data centers still use raised floors. Depending on when the raised floor was put in, you might find that your floor can't support the weight of modern gear. This is just one of the important considerations that should help convince you to move to a slab environment for your infrastructure needs.

Accessibility requirements
o As you begin moving in fully configured racks or large storage assemblies you'll soon come to recognize issues with loading docks, door sizes, hall sizes, space between aisles, height (space above racks), etc., etc.
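
To make the power density figures above a bit more tangible, here is a minimal Python sketch of the arithmetic. The 25 square feet of floor space per cabinet is an assumption, derived from the numbers quoted above (6 kW at roughly 240 Watts per square foot); the function name is made up for illustration only.

```python
# Assumption: about 25 sq ft of floor space per cabinet, which is what the
# quoted figures imply (6,000 W / 240 W per sq ft = 25 sq ft).
SQ_FT_PER_CABINET = 25.0


def watts_per_sq_ft(kw_per_cabinet: float,
                    sq_ft_per_cabinet: float = SQ_FT_PER_CABINET) -> float:
    """Convert a per-cabinet load in kW to Watts per square foot."""
    return kw_per_cabinet * 1000.0 / sq_ft_per_cabinet


legacy = watts_per_sq_ft(6)    # what most existing facilities can support
modern = watts_per_sq_ft(35)   # a densely packed modern cabinet

print(f"legacy: {legacy:.0f} W/sq ft, modern: {modern:.0f} W/sq ft")
print(f"density increase: {modern / legacy:.1f}x")
```

With your own floor space per cabinet plugged in, the same calculation gives a quick first check on whether a planned deployment fits within what your facility was designed for.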

You don't have to say no

There are numerous options for solving the problems I've outlined above, but the most important thing is awareness. Awareness will help you make plans for your transition and put you on a better negotiating footing with your supplier partners. Awareness is also ammunition to help you push for change. You shouldn't lock yourself into a strategy because of assumptions you or your team have made about what will or won't work in your environment. You don't have to address these issues alone, and being armed with knowledge about the gaps in your current data centers (internal and external) will better position you to take advantage of the tech you need, when you need it without putting your business at risk.

Twitter: mthiele10

Before you jump in to the DCIM hype...

You’re ready to enter the great world of DCIM software and jump right into the hype ?

Do you actually know what you need from a DCIM solution ? What are your functional requirements ?

So before you jump in, let’s take a step back and look at DataCenter Information Management from a 40,000 feet level: the datacenter facility information architecture.

Let’s start with ‘data’;

Data is all around us in the datacenter environment. It’s on the post-it notes on your desk, the dozen Excel files you manage to report and collect measurements and the collection of electrical and architectural drawings sitting in your desk drawer.

A modern day datacenter is filled with sensors connected to control systems. Some of the equipment is connected to central SCADA or BMS systems, some handle all the process control locally at the equipment. HVAC, electrical distribution and conversion systems, access control and CCTV; they all generate data streams. With the growth of datacenters in square meters and megawatts, the amount of data grows too.

The introduction of PUE and focus on energy efficiency have shown us the importance of data and especially data analysis. For most of us this has introduced even more data points, but enabled us to do better analysis of our datacenter’s performance. So; more data has enabled more efficiency and a better return on investment. Some of us could even say they entered the BigData era with datacenter facility data.

DCIM can play a role in the analysis of all this data, but it’s important to know where your data is first. Where is the current data stored ? What are the data streams within your datacenter ? What data is actually available and what data actually matters to your operation ? It’s a false assumption that all the data needs to be pulled in to a DCIM solution; that depends on your processes and your information requirements.

Process

Every datacenter has its collection of structured activities or tasks that produce a specific service or product for our internal or external customer. These are the primary processes focusing on the services your datacenter needs to provide. Examples are operations processes like Work Orders or Capacity Management.

These primary processes are assisted by supporting processes that make the core (primary) processes work and optimize them. Examples are Quality, Accounting or Recruitment processes.

Identifying the primary and supporting processes in your datacenter enables you to optimize them by executing them in a consistent way every time and checking the output.

If you run an ISO9001 certified shop, you will definitely know what I’m talking about.

To run the processes we need information. Information is used in our processes to make decisions. The needed information can be collected and supplied by an individual or an (IT) system.

When data is collected it’s not yet information. Applying knowledge creates information from data. IT systems can assist us to create information from data, with built-in or collected knowledge.

Identifying your datacenter processes also enables you to get a grip on the information that is needed to move the processes forward. Is this information available ? What is the quality of the information and process output ? How much time does it take to make it available ? Can this be optimized ?

DCIM solutions can assist you in creating information from data and provide information and process optimization. Most of the DCIM solutions depend on built-in knowledge on how datacenters work and operate, to facilitate this and optimize processes.

DCIM is only one of the applications used to support and optimize our datacenter processes. To support the full stack of processes we need a whole range of applications and tools. These applications can be everything from Planning to Asset Management to Customer Relationship Management (CRM) to SCADA/BMS tools.

Most of us already have some type of SCADA or BMS system running in our datacenter to control and monitor our facility infrastructure. This SCADA or BMS system will handle typical facility protocols like Modbus, BACnet or LonWorks. The programming logic used in most SCADA/BMS systems is not something found in typical DCIM solutions.

With the growing number of sensors and their data, the SCADA/BMS system must be able to handle hundreds of measurements per minute. It must store and analyze the provided data and be able to react to it to control things like remote valves and pumps. This functionality is also typically not found in DCIM solutions. (So SCADA/BMS does not equal (!=) DCIM.)

Anyone running a production datacenter will already have a collection of applications to support their datacenter processes. You may have a ticketing system, a CRM application, MS Office applications, etc. Sometimes DCIM is perceived as the only tool you need to manage your datacenter, but it will definitely not replace all your current tools and applications.

Model

Now that you have identified your data, processes and current applications, it’s time to focus on what you need DCIM for anyway: define your functional requirements.

One way of assisting you in this definition is creating your own datacenter facility information model.

IT architects are trained in creating information models, so if you have any walking around ask them to assist you.

An example of such a model is the one that the 451 Group created for their DCIM analysis. It is featured in the DCK Guide to Data Center Infrastructure Management (DCIM). (The model doesn’t cover the full scope for every organization, but it helps me to explain what I mean in this blog…)

dcim-451group

The model displays the functionality fields that would typically exist when operating a datacenter.

You can use a model like this to identify what functionality you currently don’t have (from a process and application perspective) and what can be optimized.

It also enables you to plot your current tools on the model and identify gaps and overlap. In this example I have plotted one of my SCADA/BMS systems on the (slightly modified) model:

dcim-plot-bms

I have also plotted the DCIM need for that project:

dcim-plot-dcim

Using models like this will give you a sense of what you actually expect from a DCIM solution and assist in creating your functional requirements for DCIM tool selection (RFP/RFI).
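
If you prefer to do the plotting exercise in a few lines of code instead of on a drawing, it could look something like the Python sketch below. The functional areas and tool names are purely hypothetical placeholders (they are not taken from the 451 Group model); replace them with the fields of your own model and your own applications.

```python
# Hypothetical functional areas from "your" information model.
REQUIRED = {
    "asset management",
    "capacity planning",
    "power monitoring",
    "environmental monitoring",
    "alarm handling",
}

# Hypothetical current tools and the areas each one covers today.
TOOLS = {
    "BMS": {"power monitoring", "environmental monitoring", "alarm handling"},
    "ticketing": {"alarm handling"},
}


def find_gaps(required: set[str], tools: dict[str, set[str]]) -> set[str]:
    """Areas in the model that no current tool covers."""
    covered = set().union(*tools.values())
    return required - covered


def find_overlap(tools: dict[str, set[str]]) -> dict[str, list[str]]:
    """Areas covered by more than one tool, with the tools involved."""
    owners: dict[str, list[str]] = {}
    for tool, areas in tools.items():
        for area in areas:
            owners.setdefault(area, []).append(tool)
    return {area: sorted(names) for area, names in owners.items() if len(names) > 1}


print("gaps:", sorted(find_gaps(REQUIRED, TOOLS)))
print("overlap:", find_overlap(TOOLS))
```

The gaps are candidates for your DCIM functional requirements; the overlaps are places where you need to decide which tool stays in the lead.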

Integration is key

Modern day IT information management consists of collections of applications and datastores, connected for optimal information exchange. IT information and business architects have already tried the ‘one application to rule them all’ approach before and failed. Because creating information islands also doesn’t work, we need to enable applications and information stores to talk to each other.

You may have some customer information about the usage of datacenter racks in a CRM system like Salesforce. You may already have some asset information of your CRACs in an asset management system or maybe a procurement system. This is all interesting and relevant information for your ‘datacenter view on the world’. Connecting all the systems and datastores could get really ugly, time consuming and error-prone:

dc-info-connect

IT architects already struggled with this some time ago when integrating general business applications. This gave rise to things like service-oriented architecture (SOA), enterprise service bus (ESB) and application programming interface (API). All fancy words (and IT loves its 3 letter acronyms) for IT architectural models that enable applications to talk to each other.

dc-info-connect-soa

The DCIM solution you select needs to be able to integrate into your current world of IT applications and datastores.

When looking at integration, you need to decide what information is authoritative and how the information will flow. Example: you may have an asset management system containing unique asset names and numbers for your large datacenter assets like pumps, CRACs and PDUs. You would want this information to be pushed out to the DCIM solution but changes in the asset names should only be possible in the asset management system. Your asset management system would then be considered authoritative for those information fields and information will only be pushed from the asset system to DCIM and not vice versa (flow).
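
The authoritative source and one-way flow described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the field names, the record layout and the function name are hypothetical, and a real integration would go through the APIs of your asset management and DCIM systems rather than plain dictionaries.

```python
# Assumption: these fields are owned by the asset management system.
AUTHORITATIVE_FIELDS = ("asset_name", "asset_number")


def push_assets_to_dcim(asset_system: dict[str, dict[str, str]],
                        dcim: dict[str, dict[str, str]]) -> dict[str, dict[str, str]]:
    """One-way sync: overwrite authoritative fields in DCIM, never the reverse."""
    synced = {asset_id: dict(record) for asset_id, record in dcim.items()}
    for asset_id, source in asset_system.items():
        target = synced.setdefault(asset_id, {})
        for field in AUTHORITATIVE_FIELDS:
            target[field] = source[field]  # local DCIM edits to these fields are lost
    return synced


asset_system = {"A-100": {"asset_name": "CRAC-01", "asset_number": "100"}}
dcim = {"A-100": {"asset_name": "renamed-in-dcim", "asset_number": "100",
                  "room": "DC1-hall-2"}}  # "room" is a DCIM-owned field

print(push_assets_to_dcim(asset_system, dcim))
```

Note that the DCIM-owned field (the room) survives the sync, while the asset name edited in DCIM is reset to the value from the authoritative system.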

Integration also means you don’t have to pull all the data from every available data source in to your DCIM solution. Select only the information and data that would really add value to your DCIM usage. Also be aware that integration is not the only way to aggregate data. Reporting tools (sometimes part of the DCIM solution) can collect data from multiple datasources and combine them in one nice report, without the need to duplicate information by pulling a copy in to the DCIM database.

The 451group model does an excellent job of displaying this need for integration showing the “integration and reporting” layer across all layers.

Using your own information model you can also plot integration and data sources.

dcim-plot-integrate

Integration within the full datacenter stack (from facilities to IT)  is also key for the future of datacenter efficiency like I mentioned in my “Where is the open datacenter facility API ?” blog.

So, to summarize:

  • Look at what data you currently have, where it is stored and how that data flows across your infrastructure.
  • Look at the information and functionality you need by analyzing your datacenter processes. Identify information gaps and translate them to functional requirements.
  • Look at the current tools and applications ; what applications to replace with DCIM and what applications to integrate with DCIM. What are the integration requirements and what information source is authoritative ?
  • Create your own datacenter facility information model. Position all your current applications on the model. (If you have in-house IT (information) architects; have them assist you…)

Preparing your DCIM tool selection this way will save you from headaches and disappointment after the implementation.

In my next blog we will jump to the implementation phase of DCIM.

 

More resources:

Full credits for the DCIM model used in this blog, go to the 451Group. Taken from the excellent DCK Guide to Data Center Infrastructure Management (DCIM) at http://www.datacenterknowledge.com/archives/2012/05/22/guide-data-center-infrastructure-management-dcim/

Disclosure: between 2006 and 2012 I have selected, bought and implemented three different DCIM solutions for the companies I worked for. At that time I was also part of either the beta-pilot group for those vendors or on the Customer Advisory Board. That doesn’t make me a DCIM expert, but it generated some insight into what is sold and what actually works and gets used.

Where is the open datacenter facility API ?

For some time the Datacenter Pulse top 10 has featured an item called ‘Converged Infrastructure Intelligence’. The 2012 presentation mentioned:

Treat the DC infrastructure as an IT system;

- Converge in the infrastructure instrumentation and control systems

- Connect it into the IT systems for ultimate control

Standardize connections and protocols to connect components

With datacenter infrastructure becoming a more complex system and the need for better efficiency within the whole datacenter stack, the need arises to integrate layers of the stack and make them ‘talk’ to each other.

This is shown in the DCP Stack framework with the need for ‘integrated control systems’; going up from the (facility) real-estate layer to the (IT) platform layer.

So if we have the ‘integrated control systems’, what would we be able to do?

We could:

  • Influence behavior (can’t control what you don’t know); application developers can be given insight on their power usage when they write code for example. This is one of the needed steps for more energy efficient application programming. It will also provide more insight of the complete energy flow and more detailed measurements.
  • Design for lower level TIER datacenters; when failure is imminent, IT systems can be triggered to move workloads to other datacenter locations. This can be triggered by signals from the facility equipment to the IT systems.
  • Design close control cooling systems that trigger on real CPU and memory temperature and not on room level temperature sensors. This could eliminate hot spots and focus the cooling energy consumption on the spots where it is really needed. It could even make the cooling system aware of oncoming throttle up from IT systems.
  • Optimize datacenters for smart grid. The increase of sustainable power sources like wind and solar energy, increases the need for more flexibility in energy consumption. Some may think this is only the case when you introduce onsite sustainable power generation, but the energy market will be affected by the general availability of sustainable power sources also. In the end the ability to be flexible will lead to lower energy prices. Real supply and demand management in the datacenters requires integrated information and control from the facility layers and IT layers of the stack.

The gap between IT and facility does not only exist between IT and facility staff but also between their information systems. Closing the gap between people and systems will make the datacenter more efficient and more reliable, and will open up a whole new world of possibilities.

This all leads to something that has been on my wish list for a long, long time: the datacenter facility API (Application programming interface)

I’m aware that we have BMS systems supporting open protocols like BACnet, LonWorks and Modbus, and that is great. But they are not ‘IT ready’. I know some BMS systems support integration using XML and SOAP but that is not based on a generic ‘open standard framework’ for datacenter facilities.

So what does this API need to be ?

First it needs to be an ‘open standard’ framework; publicly available and no rights restrictions for the usage of the API framework.

This will avoid vendor lock-in. History has shown us, especially in the area of SCADA and BMS systems, that our vendors come up with many great new proprietary technologies. While I understand that the development of new technology takes time and a great deal of money, locking me in to your specific system is not acceptable anymore.

A vendor proprietary system in the co-lo and wholesale facility will lead to the lock-in of co-lo customers. This is great for the co-lo datacenter owner, but not for its customer. Datacenter owners, operators and users need to be able to move between facilities and systems.

Every vendor that uses the API framework needs to use the same routines, data structures and object classes. Standardized. And yes, I used the word ‘Standardized’. So it’s a framework we all need to agree upon.

These two sentences are the big difference between what is already available and what we actually need. It should not matter if you place your IT systems in your own datacenter or with co-lo provider X, Y, Z. The API will provide the same information structure and layout anywhere…

(While it would be good to have the BMS market disrupted by open source development, having an open standard does not mean all the surrounding software needs to be open source. Open standard does not equal open source and vice versa.)

It needs to be IT ready. An IT application developer needs to be able to talk to the API just like he would to any other IT application API; so no strange facility protocols. Talk IP. Talk SOAP or better: REST. Talk something that is easy to understand and implement for the modern day application developer.

All this openness and ease of use may be scary for vendors and even end users because many SCADA and BMS systems are famous for relying on ‘security through obscurity’. All the facility specific protocols are notoriously hard to understand and program against. So if you don’t want to lose this false sense of security as a vendor; give us a ‘read only’ API. I would be very happy with only this first step…

So what information should this API be able to feed ?

Most information would be nice to have in near real time :

  • Temperature at rack level
  • Temperature outside of the building
  • kWh at rack level (other energy related measurements would be nice too)
  • warnings / alarms at rack and facility level
  • kWh price (can be pulled from the energy market, but that doesn’t include the full datacenter kWh price (like a PUE markup))

(all if and where applicable and available)
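
To give an idea of how simple such a read-only information feed could be for an application developer, here is a rough Python sketch of a rack level record serialized to JSON. None of this is an existing standard: the class, the field names and the way the PUE markup is applied to the kWh price (market price multiplied by PUE) are all assumptions for the sake of illustration.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)  # frozen: consumers get a read-only view
class RackReading:
    """Hypothetical rack level record; every field name is an assumption."""
    rack_id: str
    rack_temp_c: float
    outside_temp_c: float
    energy_kwh: float
    active_alarms: tuple[str, ...]
    market_price_per_kwh: float
    pue: float

    def effective_price_per_kwh(self) -> float:
        # Assumption: the "PUE markup" is modeled as market price times PUE.
        return round(self.market_price_per_kwh * self.pue, 4)


def to_json(reading: RackReading) -> str:
    """Serialize a reading the way a REST style endpoint might return it."""
    payload = asdict(reading)
    payload["effective_price_per_kwh"] = reading.effective_price_per_kwh()
    return json.dumps(payload, sort_keys=True)


reading = RackReading("R12", 24.5, 11.0, 3.2, ("door open",), 0.10, 1.5)
print(to_json(reading))
```

The point is not this particular layout, but that every facility would expose the same structure, so the same few lines of client code work anywhere.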

The information owner would need features like access control for rack level information exchange and be able to tweak the real time features; we don’t want to create unmanageable information streams; in security, volume and amount.

So what do you think the API should look like? What information exchange should it provide? And more importantly; who should lead the effort to create the framework? Or… do you believe the Physical Datacenter API framework is already here?

More:

Good API design by Google : http://www.youtube.com/watch?v=heh4OeB9A-c&feature=gv

The green PUE monster

My fellow Datacenter Pulse (DCP) colleague Mark Thiele wrote a good article on the use of PUE in the datacenter industry. He basically argues that you should look at the TCO of the datacenter and have a holistic view (like we promote with the DCP stack)

He opens with the ‘my PUE is better than yours’ remark we see going around in the industry. The Green Grid has pointed out this misuse of PUE before. (I have written several articles about this issue on my Dutch datacenter blog.)

Well guess what… we kinda created this monster ourselves;

Commodity

Datacenters are going commodity. Differentiation is in efficiency; looking at all the signs in the datacenter industry: the datacenter is becoming a commodity. The general idea is that any successful product will end up a commodity. See Simon Wardley’s excellent presentations on this subject.

Example signs are:

(I’ve got lots more, but that’s for another blog and beside the point now…)

Most of this leads to technology being available to everyone. You still need a big budget to start building a datacenter but cost is coming down thanks to advances in IT technology (like: do we still need to build TIER4?) and facility technology. A great example is the option to build modular and from an assembly line. This reduces datacenter production cost and you can spread CAPEX across time.

For any commodity product the competitive advantage diminishes, prices go down and if you’re in the market of selling the stuff: your differentiation is on price and efficiency.

So we push towards efficiency and lowering our operational cost, to be cheaper than the competition. It doesn’t matter what type of datacenter owner you are (enterprise, cloud, co-lo, wholesale…); the competitive advantage is in the operational cost. Everyone already has a SAS70, ISO27001, ISO9001, PCI, etc. certificate. Everyone can get and build the technology, and at some point it doesn’t pay off to worry about the small engineering details anymore; the costs are too high to get enough competitive advantage in return.

Marketing & Sales

If you’re in the datacenter co-lo or wholesale business, then your marketing and sales guys are looking for the (next) USP.

With datacenter efficiency on the rise, it’s nice to have a single figure to tell your customer you are more efficient than the competition (and able to offer lower prices with it..).

PUE presented a great opportunity to get a nice USP for our sales guys. And I can’t blame them: it’s hard to sell a (going) commodity product and it’s something our customers are asking for.

Sustainable & marketing

The whole ‘green wash/hype’ (before the economic crisis) added to the misuse of PUE from a sales and marketing perspective. Everyone was, and still is, building the next greenest datacenter on this planet. It would be nice to have a single figure to promote this ‘green stuff’ and tell the world about this new engineering marvel… and PUE is happily adopted for this.

On the enterprise/cloud datacenter owners side it’s not very different.. it's just another angle. Especially the larger datacenter (cloud) owners are pushed from a marketing perspective thanks to organizations like Greenpeace. They demand full disclosure of datacenter energy use and sustainability on behalf of the world’s population.

PUE is a nice tool to get some of the pressure off and tell the world you have the greenest datacenter operation, like Facebook, Google, Microsoft and others are now doing. The problem with those numbers is they don’t say anything without the full context. And like Mark says in his article; it doesn’t say anything about the full datacenter stack.

Sustainable & Government

The really interesting thing I have been dealing with recently is government and PUE.

Many state and local government organizations are looking at datacenters from an energy use perspective. This focus is thanks to reports from research and governmental institutes (for example EPA) and organizations like Greenpeace.

Government can use either the ‘carrot (subsidies) or stick (penalties)’ approach, but it always requires some kind of regulation and some form of measurement. They happily adopted PUE for this.

The main problem with this is that PUE is not ready to be audited. Sure, general guidelines for PUE calculation are available, but they still don’t cover all the details and leave opportunity to ‘game the system’. It also doesn’t address the full holistic datacenter approach and the fact that you cannot use PUE to compare different datacenters.

Datacenter customer

Some of the governmental organizations are also using PUE for their co-lo tenders. And they are not the only ones. More and more RFP’s on the datacenter market state a specific PUE number or target with additional credits for the lowest number.

I can’t blame the datacenter customer; the datacenter is a pretty complex facility box if you’re not educated in that industry. It would be nice to just throw some numbers at it like TIER1-4 and PUE1-2 and select the best solution provider on the market…

I can’t blame the datacenter owner for answering with 'the best possible number he can think of'… they just want the business…

Datacenter vendor

As usual I’m part of the problem also. When I go shopping for the next big thing in datacenter technology, PUE is always part of the conversation.

Datacenter equipment vendors (especially cooling and power) are happy to throw some numbers my way to convince me that their solution is the most efficient.

The PUE numbers are on the equipment spec sheets, on the vendor website and in the sales meeting… but at the end I always have to conclude that our little number dance really doesn’t tell me anything. Just like Mark mentions: I go back to TCO.

Our own monster

So… we created this monster as a datacenter industry. Owners and consumers. Government and industry groups. I can also assure you that this will continue and will also happen to some of the great other Green Grid initiatives like CUE, WUE and the Maturity Model. It’s just a way of trying to put a number on what can be a very complex stack of functions called 'the datacenter'.

Only way to diminish the monster is to educate our industry colleagues and our customers. And stop throwing the numbers around…

For the record1: the use of PUE

Don't get me wrong, I love PUE and have used it successfully many times to benchmark my own data centers and prove to senior management that we progressed in efficiency. It's also the way the EU Datacenter Code of Conduct works: show progress every year by calculating the PUE for the same datacenter with the same calculation method. This way you will have an apples-to-apples comparison.
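
As a small illustration of that kind of year-over-year tracking, here is a Python sketch. PUE is the total facility energy divided by the IT equipment energy; the yearly kWh figures below are invented for the example and do not come from any real datacenter.

```python
def pue(total_facility_kwh: float, it_equipment_kwh: float) -> float:
    """Power Usage Effectiveness: total facility energy / IT equipment energy."""
    if it_equipment_kwh <= 0:
        raise ValueError("IT equipment energy must be greater than zero")
    return total_facility_kwh / it_equipment_kwh


# Hypothetical yearly totals for ONE datacenter, measured the same way each year:
# year -> (total facility kWh, IT equipment kWh)
yearly_kwh = {
    2010: (1_800_000, 1_000_000),
    2011: (1_650_000, 1_000_000),
    2012: (1_500_000, 1_000_000),
}

trend = {year: pue(total, it) for year, (total, it) in sorted(yearly_kwh.items())}

for year, value in trend.items():
    print(f"{year}: PUE {value:.2f}")
```

The numbers only mean something as a trend for this one facility; putting them next to someone else's PUE brings you right back to the monster described above.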

It's just that I don't like the way PUE is used in the above examples...

For the record2: On datacenter and green

Full disclosure: Datacenters are not green and will never be green. They consume large amounts of energy and will consume more energy in the future. All thanks to Jevons Paradox and the fact that we start to consume more CPU/Storage if prices go down…. And yes, someone has to build datacenters for all the CPUs and disks…

See GreenMonk - Simon Wardley on ‘Cloud & Green’

I do think that:

1. We should aim to build the most efficient datacenters on the planet. It should not only focus on energy consumption but on all resources used for build and operation.

2. We cannot keep up with the rising demand from an energy-grid perspective. We should start looking for alternative ways for datacenter builds and deployment.

 

Starting a blog avalanche

After some time of blogging on my Dutch website (www.janwiersma.com ), some of my English friends asked me if I could re-publish and extend some of them in English. This way the knowledge can be shared with a bigger audience.

While my English writing style definitely needs some work, I will give it a go at my DCP blog.

Get ready for an avalanche of blogs. ;-)

Server Power Consumption Cost Exceeds the Cost of the Server – SO WHAT!

Saying a modern server uses too much power is like saying a train uses more power than a horse drawn wagon. Of course it does, but it also does way more work. Let's not forget what's important to the question of cost and that is simply how much work is the server performing?

I've contributed to this noise in the past, but recently I've had a change of heart. After reading several recent articles that mention the cost of power exceeding the cost of the server, I've come to a new realization on this issue. The power use must be measured as it pertains to work potential.

Watts vs. Work Output

A single server today is twice as capable as a single server from two years ago. There are more CPUs and each CPU is more powerful. There's more memory and each DIMM is faster and has more capacity than its predecessor. So, even though the amount of power being used by each server has gone up, the actual "watts per work output" has gone down.
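
To put some numbers behind "watts per work output", here is a quick Python sketch. The wattages and work units are made up for illustration; the only thing taken from the text above is that the newer server does twice the work.

```python
def watts_per_work_unit(power_watts: float, work_units: float) -> float:
    """Power drawn per unit of work delivered (lower is better)."""
    return power_watts / work_units


# Hypothetical figures: the new server draws more power but does twice the work.
old_server = watts_per_work_unit(power_watts=300, work_units=100)
new_server = watts_per_work_unit(power_watts=400, work_units=200)

print(f"old: {old_server:.1f} W per unit, new: {new_server:.1f} W per unit")
```

In this made-up case the power draw goes up by a third while the power needed per unit of work drops by a third, which is the comparison that actually matters.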

We must always be looking to make the solutions we create more efficient in design, life cycle and use characteristics.  However, we shouldn't overlook the fact that these systems are replacing work that would otherwise use much more power.

So, if you're really worried about how much power your servers are using, there are myriad opportunities you can pursue;

Greater virtualization
Use of cloud
Improved Management platforms for your infrastructure (Cloud & Virtual)
Applications (software) written with efficiency of operation as a consideration
Right sizing your environments
Using a well-designed and efficient data center
Etc., etc..

What you shouldn't do is fall into the trap of "not seeing the forest for the trees". Focus on the activities that generate business benefit and do those actions with sustainability and efficiency as part of your process. If you start trying to extend your servers' life instead of implementing solutions that create business opportunity, you'll be missing the forest. You might also find that many companies are actually improving their bottom line by replacing efficiently utilized servers more often, so they can reduce wasted energy and increase the work output per Watt.