Duncan Epping is a Chief Technologist at VMware, working in the Office of the CTO of the Storage and Availability business unit. He specializes in software-defined storage, hyper-converged infrastructure and business continuity/disaster recovery solutions. He is also the owner and author of the VMware virtualization blog Yellow-Bricks.com, which has been voted the number 1 blog eight consecutive times by the virtualization community. He explained to us why business continuity is so important for all companies and what the risks are of not having such a plan. We also asked him what he thinks the virtualization world has in store for the future.

– Duncan, your blog Yellow Bricks is the hottest virtualization blog in 2017 and you are listed as one of the most influential people within the virtualization industry. How did your adventure with virtualization start, and why did you decide to start your blog?

First of all, thanks for the compliments. When I got started with blogging there weren’t too many people blogging yet. There were Scott Lowe and Mike Laverick. Chad Sakac had just started his blog and I had just stopped my other writing efforts (alternative music). I have always liked writing and figured that when I stopped writing about music (interviews, reviews of albums etc.) I would find a different topic.

Around that time I worked for a consultancy company and was designing virtual infrastructures and deploying these on a daily basis. I regularly ran into issues and I wanted to document these so that I had the solution at hand. I figured I would share those “notes” through blogs and that is how it all started. Things escalated fast, I must say. I joined VMware within a year and the popularity of my blog continued to grow steadily. I was very humbled, and surprised, when I entered the “top bloggers” list. I have been blogging for about 9 years now, and still enjoy it!

– Why is business continuity crucial for business and how can companies achieve it?

Business continuity and/or a disaster recovery plan is often overlooked. People design an environment and know they have a backup running somewhere and think that is enough. It is surprising how little people know (or have documented) about recovering from zero. What if the datacentre burns down? How do I get all my services/applications up and running again? This is a very challenging task and requires a lot of thought, testing and, unfortunately, documentation.

It all starts with a discussion with your application/business owners. Identify the applications which are critical to the existence of your company and determine what the expected recovery point objective and recovery time objective are. How long can the application be offline, and how old can the recovered data be? Go over the same exercise for any facilitating applications and services like Active Directory/DNS etc. Based on the outcome you determine what works for what: a backup solution for applications which can tolerate days of downtime, replication (sync or async) for others. And this was the easy part.
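To make the exercise above a bit more concrete, here is a minimal Python sketch of mapping recovery objectives to a protection method. Everything in it is an assumption made for illustration: the `Application` class, the `protection_for` function, the example applications and, above all, the thresholds. The real numbers have to come out of the discussion with the application and business owners.

```python
from dataclasses import dataclass

# Hypothetical thresholds, chosen only for illustration.
ONE_DAY_MINUTES = 24 * 60


@dataclass
class Application:
    name: str
    rpo_minutes: int  # recovery point objective: how old may recovered data be?
    rto_minutes: int  # recovery time objective: how long may the app be offline?


def protection_for(app: Application) -> str:
    """Pick a protection method from the agreed RPO and RTO (illustrative rules)."""
    if app.rpo_minutes == 0:
        # No data loss tolerated: every write must exist in both sites.
        return "sync replication"
    if app.rpo_minutes >= ONE_DAY_MINUTES and app.rto_minutes >= ONE_DAY_MINUTES:
        # A day or more of data loss and downtime is acceptable.
        return "backup"
    return "async replication"


# Example inventory; names and objectives are made up.
apps = [
    Application("Active Directory/DNS", rpo_minutes=0, rto_minutes=15),
    Application("order system", rpo_minutes=15, rto_minutes=60),
    Application("archive", rpo_minutes=2 * ONE_DAY_MINUTES, rto_minutes=3 * ONE_DAY_MINUTES),
]

for app in apps:
    print(f"{app.name}: {protection_for(app)}")
```

The point of writing it down like this is not the code itself, but that the outcome of each conversation ends up documented per application, including the facilitating services.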

Now you will need to decide which tooling/products to use. Create a plan/process, document it, test it end to end and repeat that occasionally. There’s a lot of work that goes into this.

– Is there any specific sector or company type for which business continuity is more important?

I believe that all companies should have a business continuity plan. It could be as simple as: phone up company X and restore from tape. But something needs to be documented and decided. The risks should be well understood and accepted by the application and business owner. We as IT provide a service, and we need to ensure it aligns with the requirements of the business/application owners. This goes for every sector. Even if you have agreed that there’s no need for a business continuity plan, this should be documented.

– What are the risks of not having a business continuity plan?

I think the risks are pretty straightforward: loss of revenue. And this can range from a couple of dollars/euros to millions. Some of the enterprise organizations I worked with had multiple locations with active datacentres across these locations, not only from an infrastructure perspective but also from an application point of view. Avoiding (or minimizing) downtime at all costs is their motto, simply because every minute of downtime can cost over 100,000 euros. Even higher numbers are not uncommon.

So this is very important to realize when you have these discussions with application and business owners. They should be able to tell you what the cost of downtime would be. At the same time, they need to know what the cost of protection (at a certain level) would be. Sync replication comes at a cost, and so does a backup solution. Having these conversations without knowing the cost is useless, as most people will want maximum protection, until of course they find out what the cost is. Then reality sinks in and a selection will be made based on what is crucial to the company and what is not.
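The trade-off described here can be put into a rough back-of-the-envelope calculation. The Python sketch below is only an illustration under stated assumptions: the function names, the outage length, the outage frequency and the protection cost are all hypothetical. Only the figure of 100,000 euro per minute is taken from the example mentioned earlier.

```python
def downtime_cost(cost_per_minute: float, outage_minutes: float) -> float:
    """Revenue lost during a single outage."""
    return cost_per_minute * outage_minutes


def expected_yearly_loss(cost_per_minute: float,
                         outage_minutes: float,
                         years_between_outages: float) -> float:
    """Average loss per year, assuming one outage every N years (an assumption)."""
    return downtime_cost(cost_per_minute, outage_minutes) / years_between_outages


def protection_pays_off(yearly_loss: float, protection_cost_per_year: float) -> bool:
    """Protection is worth discussing when it costs less than the expected loss."""
    return protection_cost_per_year < yearly_loss


# Assumed scenario: a one-hour outage once every ten years at
# 100,000 euro per minute, against a hypothetical yearly price
# tag of 250,000 euro for the protection solution.
loss = expected_yearly_loss(cost_per_minute=100_000,
                            outage_minutes=60,
                            years_between_outages=10)
print(f"Expected loss per year: {loss:,.0f} euro")
print("Protection pays off:", protection_pays_off(loss, 250_000))
```

Real numbers will be far less tidy, but even a crude estimate like this gives the application and business owners something concrete to weigh against the price of each protection level.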

– What is, in your opinion, the future of virtualization?

Very challenging question. Who can predict the future? If you ask me, from a virtualization perspective we’ve seen the biggest shifts in features and functionality already. The big change in the upcoming year is in what surrounds that layer. Storage, networking and backup are being disrupted as we speak. We have the hyper-converged solutions slowly taking over the world of “storage” (of course there is more to it, but this is what they are mainly displacing).

Then we have products like NSX changing how networks are designed and how security policies are applied, moving networking and security services into software which runs in the hypervisor instead of having security and other services at the edge. Then there’s of course the application layer, with container technologies and schedulers changing how we deploy new services and how flexible we are in scaling out and in. And even more exciting: IoT. Every device will be connected in a couple of years. Not just your phone and laptop, but your watch, fridge, car etc. What do we do with these edge compute instances? What do we do with all the data gathered by those edge devices? Many things to think about.