Secrets of Building a Successful Big Data Team

Note: an edited version of this story appeared on the Hortonworks blog on November 10, 2017.

Having a dedicated big data team is critical to achieving successful outcomes with the data you’re collecting. This post will discuss the elements of building a successful big data team and how this team operates within your organization.

Step One: Achieve Executive Buy-In

Ask any business school professor or seasoned project manager what one has to do first and they will all say the same thing: obtain executive buy-in. The most fundamental part of any business transformation is buy-in and enrollment from the top. A top-down approach, with approval at the highest levels, is a key step in building a big data story in your organization. Until this step is done, don’t pass go, don’t collect $200: keep working until you have executive approval.

Step Two: Set Up a Center of Excellence

Building a center of excellence, a shared employee facility that develops resources and best practices, is the next step in this process. This CoE could be as small as one person, or it could be as big as you want. The CoE’s members should comprise a subject matter expert from each impacted group across the organization in order to adequately represent the way the transformation is happening. In this way, not only have you gotten buy-in from the executive level to start the overall transformation, but you have participation across the organization so all affected departments can feel heard and acknowledged.

Part of building a proper and maximally effective center of excellence is to encourage the naysayers. Platitudes are fantastic, and open mindedness is a great thing to strive for, but the reality is that everyone has their own agenda and their own starting point of view. In building the CoE, you want cheerleaders as well as skeptics in order to keep debate alive and valuable. Ironically enough, the naysayers often end up leading the CoE; they become the most passionate over time because once their objections have been overcome, they understand why the transformation is so important.

Step Three: Building the Team

Once your center of excellence is in place, the bulk of building your big data team lies in finding individual employees to flesh out the team. This is about 75% art, 25% science. From the science perspective, you’ll want to screen workers through the job description, background requirements, and appropriate thresholds of experience. You will want workers with 7-12 years of experience in IT and some exposure to production Linux; data warehouse skills and experience are a very big plus. Unfortunately, this won’t get you all the way there: the industry at this point in time doesn’t have a deep enough skill set or body of work to make it easy to find workers on those limited merits alone. The current H-1B situation is actively contributing to the dearth of objectively qualified candidates. It is akin to trying to find a needle in a haystack.

This is where the 75% art part comes in: you build your team from folks with personality and passion who also come from relevant backgrounds. How many of the candidates you interview are not only willing to learn Hadoop but also have the passion to do so? The interview and subsequent in-person conversations are where you will find this passion and sense of opportunity. These soft skills have to be found in a face to face interview, and your interview process should dig into what the candidate’s exact experience translates into. You may also want to consider pushing candidates into a live demo where they perform a real world task, and then discuss how they solved the problems you staged. Many times candidates are unsuccessful at completing a demo, but the real key is, can they explain why?

You will also find the best candidates are either toward the beginning of their careers or toward the end, and not necessarily in the middle. For more experienced resources, there is a familiarity with the “wild west” that is Big Data these days as it bears resemblance to IT 20 years ago. Things weren’t integrated, and staff had to do the heavy lifting: build it, script it, troubleshoot it. That innate ability to self-start is an asset. For younger resources, they are quick to adopt tech that can build scripts and automate, which is also a useful skill.

Leaders of these teams should be neutral parties in the organization with no driving interest of their own other than to help the company as a whole change. If a leader does end up being from the department that is funding the project, that department’s interest will often eclipse the greater good. The ideal leader is an employee who is not fully integrated, almost a third party.

Finding Talent

Finding this talent is also a challenge. One way is through the Hortonworks University, a program that 12-15 colleges nationwide provide in order to establish Hadoop as a fully accredited part of their curriculum. Hortonworks pays the costs of the program incurred by the university so long as the school makes the courses part of the curriculum. You might also consider recruitment days at local universities, asking professors: who stands out? Who solves things? Provide internship and trial opportunities for those names you receive.

Word of mouth is also a proven way to find candidates. The raw truth is that the Big Data community is a small enough group in the world that if you happen to be really good at Hadoop, then people know who you are.

The Last Word

Ultimately, a big data transformation is an enablement opportunity to get your entire organization to go learn. Over time, we can all get stale, but this transformation can be a driver of learning, a place to get hands dirty with something new, and an opportunity to create new subject matter experts. Don’t be afraid to use this rubric to build a successful team.

Making Big Data More Accessible

An edited version of this article was published on the Hortonworks blog. Credit: Hortonworks

Big data, and storing and analyzing it, has the reputation of being complex and confusing. However, because of the great insight that can be taken from big data, it’s important to allow data science to be more easily accessible.

How can business leaders lower the barrier to entry to make data science more accessible to everyone? I see four ways this is possible.

Invest in natural language capabilities. Math sometimes has the tendency to scare people away, so if your plan is for everyone to learn Python and reduce their questions to a formula, I have news: that is not going to bode well for a successful uptake. Instead, look for platforms that allow users to query data sets using standard English questions, like “How many more sales did I make this year than last year?” or “What are the top five things my customers purchase along with Widget Model A?” This type of natural language processing makes accessing large quantities of data easier and more approachable, but it also has a side benefit of encouraging interest and curiosity in further analyzing data. As your users ask natural questions and receive straightforward answers, they think of more questions, which leads to more answers and more insight. Save the math and SQL-like queries for your developers and put a friendly front end on your data.
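To make the contrast concrete, here is a minimal sketch of the kind of logic a natural language front end would hide behind the second example question. The order data, every product name other than Widget Model A, and the function name top_copurchased are hypothetical and made up for illustration; a real platform would run something equivalent against its own data store.

```python
from collections import Counter

# Hypothetical order data: each order is the set of products bought together.
ORDERS = [
    {"Widget Model A", "Mounting Kit", "Batteries"},
    {"Widget Model A", "Batteries"},
    {"Widget Model B", "Batteries"},
    {"Widget Model A", "Carrying Case", "Batteries", "Mounting Kit"},
    {"Carrying Case"},
]


def top_copurchased(orders, product, n=5):
    """Return up to n (item, count) pairs most often bought with product."""
    counts = Counter()
    for order in orders:
        if product in order:
            # Count every other item that shares an order with the product.
            counts.update(item for item in order if item != product)
    return counts.most_common(n)


if __name__ == "__main__":
    for item, count in top_copurchased(ORDERS, "Widget Model A"):
        print(f"{item}: {count}")
```

Even a question this simple takes a loop, a filter, and a ranking step, which is exactly the work most business users should never have to see.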

Make visualization an integral part of your analysis and presentation. You surely have heard the old phrase, “I’m a visual person.” Some people simply absorb and digest information better if it is presented visually. There are currently many platforms and frameworks on the market that can take a raw data query result and turn it into a rich graphical answer to a user’s question. Further, the more sophisticated of these platforms can allow you to interact with certain subsections of a visualization, such as exploding out a pie chart into smaller subpieces or layering statistics onto a map of a certain geographic location. Even heat maps and gradients can make statistics pop into life and really enhance your users’ understanding of exactly what the data is telling them. Visualizations can also encourage users to ask more questions and interrogate the data further by creating a rich, inviting atmosphere in which to discover new ways of thinking about data.

Start small and then aim to grow your user base. Chances are, there is a vocal community of employees or contractors within your organization who are already clamoring to access new data science tools. Perhaps you have already harnessed their enthusiasm and enrolled them in a controlled pilot of your data deployment. If you have not, however, consider finding a group of individuals excited about the potential for big data and data science and invite them to preview your plans, test systems, and provide very valuable feedback about what they like, do not like, and wish to see or need to have to augment their roles. After all, the best camp fires start with proper kindling; a data portal project is no exception to this rule. Beginning with a small but dedicated group of users is a tried and true method of getting a successful project off the ground before expanding it so quickly that it outgrows its capabilities.

Embrace the power of data on the go. Mobility has changed the world around us, with computers more powerful than the space shuttle available in pocket sized form off the shelf at Target for $80 or less. Many—I dare say most—professionals have smartphones or tablets, some either issued by or paid for by their employer. This means you have an audience that expects to be able to get answers on their mobile devices whenever and wherever they are. Your data science projects and deployments must be available in a mobile friendly format, either with responsive web design, customized native apps for the popular mobile device platforms, or perhaps some combination of both paths. That mobile app ought to also satisfy the points we have already established, including supporting natural language and providing rich visualization support with the capability to touch, drag, pinch, zoom, scroll, and more. The bottom line: in this day and age, you simply must make data accessible on mobile devices.

Is Windows to Go a Good Solution for the International Airline Laptop Bans?

An edited version of this story ran on August 8, 2017. Credit: Computerworld

Often Microsoft presents technological solutions to problems that only a tiny percentage of its customer base has. Windows to Go was just such a feature—a nice solution to a problem that was virtually non-existent back when it was first released in 2011. However, six years later, that non-existent problem could very well be widespread.

What is Windows to Go? It’s a way to take a Windows installation with you on a USB thumb drive. You pop that thumb drive into any computer, boot from the USB, and your personalized installation of Windows—with all of your applications and files and access to corporate resources—is there. When finished, shut down, unplug the USB thumb drive, and away you go. It’s essentially portable Windows.

Windows to Go becomes more attractive in a world that seems to find traveling with electronics to be a security threat. You probably recall the recent news of the ban on laptops on flights entering the United States from selected Middle Eastern countries and, more recently, on flights coming from Europe. While this ban was lifted, more stringent security protocols are reportedly being developed for both domestic and international flights. We could soon be entering a world where laptops are either checked in the baggage hold at airports without fail or not brought on trips at all, or a world in which officers at ports of entry demand access to electronics for either cursory or in-depth examinations. Having a nice USB thumb drive tucked away somewhere could be a real asset.

Having laptops subject to examination, or possibly locked away outside of an employee’s purview, has obvious implications for enterprises around the world. Many organizations have security policies that prohibit employees from leaving their corporate laptops unattended. Many organizations do not, as a matter of policy, encrypt the local hard drives of laptops they issue to employees. (This is very obviously a mistake in today’s world, but that does not change the reality of the situation.) Many organizations send field workers into some very remote and insecure areas of the world, often with real business assets and trade secrets stored in digital form on workers’ laptops.

These types of security protocols make it more likely that you will be separated from your laptop. Your business travelers have to put notebooks with company secrets somewhere else not within their direct control and they have virtually no say what happens to those notebooks when they are outside your travelers’ fields of vision. For most enterprises, this is far too much risk.

But that risk is a lot lower when you take Windows with you on a thumb drive and worry about the actual PC you use whenever you get to where you are going. Let’s learn a little more about Windows to Go.

What is Windows to Go?

Windows to Go was introduced in the Windows 8 release wave as an alternative to virtual desktop infrastructure: it is essentially a portable, entirely self-contained installation of Windows that you use on a USB thumb drive—that drive needs to be USB 3 in order to have the read, write and data transmission speeds necessary for a modern computer to run an operating system off of it. But what you end up with, after you configure it properly, is an entirely self-contained computer for a knowledge worker that is encrypted and fits in one’s pocket. You can pop it in your travel bag, in the car, even in your socks (if you are that type of person) and all you need to do is plug it into any reasonably modern PC, boot off the USB drive, and your OS, documents, wallpaper, personal settings, applications, and everything else is right there for you. This copy of the OS is managed through an IT department and thus it can have VPN software on it, or if you have configured DirectAccess, that copy of Windows can reach out over the Internet and retrieve its managed settings, Group Policy object configuration, and so on.

There are some key differences with Windows to Go, in its default configuration, as opposed to a similar copy of Windows installed on a regular fixed drive in a PC as you have come to expect:

  • The local drive in the computer on which Windows to Go is run is hidden by default. This keeps whatever crap is on the local system from seeping its way onto the Windows to Go USB drive as well as helps users properly save and retrieve documents to the USB stick. You can disable this functionality, but it is more secure to leave the hiding feature on.
  • Upon the first boot on a new Windows to Go target computer (that is, the “guest hardware” into which you plug the Windows to Go USB stick), a process goes through and identifies the right hardware drivers for the target system and enables and installs them. This process may reboot the computer several times, after which the boot process will proceed straight into Windows.
  • Windows to Go detects drive removal. Windows in this configuration will pause the whole computer if it detects the USB drive is gone and then will shut itself down after 60 seconds if the USB drive is not reinserted into the target machine. This is to prevent folks from using their copy of Windows to Go at, say, an airport kiosk and then quickly just removing the stick without shutting down the computer—a scenario in which bad actors could then access a logged in corporate desktop. With this feature, the whole computer shuts down rather than leave access open for others. If the USB drive is reinserted within 60 seconds, then operation continues as normal.
  • Access to the Windows Store is disabled by default, but it can be reenabled through a Group Policy object change.


Otherwise, Windows to Go behaves identically to Windows fully installed on a fixed computer. The added convenience is simply that you can unplug the stick and migrate it to any other device in the future.
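The drive removal behavior described above amounts to a small state machine with a 60 second timer. The following is only a toy model of that description, not how Windows implements it: the class name, the state names, and the treatment of exactly 60 seconds as a shutdown are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional

GRACE_SECONDS = 60  # reinsertion window described in the article


@dataclass
class WorkspaceSession:
    """Toy model of a Windows to Go session reacting to drive removal."""

    state: str = "running"  # "running", "paused", or "shut_down"
    removed_at: Optional[float] = None

    def drive_removed(self, now: float) -> None:
        if self.state == "running":
            self.state = "paused"  # the whole computer pauses
            self.removed_at = now

    def tick(self, now: float) -> None:
        """Advance the clock; shut down once the grace period has elapsed."""
        # Assumption: hitting exactly 60 seconds counts as too late.
        if self.state == "paused" and now - self.removed_at >= GRACE_SECONDS:
            self.state = "shut_down"

    def drive_reinserted(self, now: float) -> None:
        self.tick(now)  # apply any timeout that has already elapsed
        if self.state == "paused":
            self.state = "running"  # back in time: operation continues
            self.removed_at = None


if __name__ == "__main__":
    session = WorkspaceSession()
    session.drive_removed(now=0.0)
    session.drive_reinserted(now=30.0)
    print(session.state)
```

The point of the model is the asymmetry: a quick reinsertion is harmless, but once the window has passed there is no path back to a logged in desktop short of booting again.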

Deploying Windows to Go

It is not much more work to deploy Windows to Go than it is to release images of any version of Windows these days; your current toolset, like DISM and ImageX, will work just fine. All you need is the correct USB drive hardware, a Windows Enterprise image, and a Windows Enterprise host computer to write and provision the Windows to Go image to the USB stick. It is possible to scale this deployment process using some PowerShell scripts so that you can make multiple sticks at once, in case these new regulations have caught you off guard and you need a solution, like, yesterday. There is a very comprehensive guide to deploying Windows to Go USB sticks on TechNet, including these scripts, and I heartily recommend walking through the process so you get a feel for the steps needed to complete the provisioning.

After the sticks are created, you just hand them out to your users and tell them to boot off of the USB. You can see where this will come in handy in these banned laptop scenarios. There is no ban on a USB thumb drive, so you have a few options:

  • Take a loaner laptop with you that has no operating system installed at all—in other words, a bare metal laptop. You can allow that to be checked according to the airline’s procedure. When you arrive at your final work destination, plug in your thumb drive, which has never left your possession, and carry on. Of course it could also have a simple installation of Linux or Windows on it; it really does not matter as you would never boot into it.
  • Use Windows to Go in a business center at a hotel or convention center. Since the computer reboots to boot into Windows to Go, you don’t have to be concerned with software keyloggers or other runtime based malware. Of course it is possible for a hardware keylogger to be installed on a keyboard so you must weigh your current acute need for computing access against the threat profile you and your business have identified.
  • Purchase burner equipment at your final destination and return with it or destroy it. You can pick up any cheap laptop at any office supply store and it would be sufficient to run Windows to Go. If you are going to a reasonably populated area, $200-300 can be invested in a cheap laptop into which you can then insert your bootable stick and be off to the races. You can then either bring the laptop home with you or dispose of it—for maximum security this is a good option.


There is currently a list of officially supported Windows to Go USB drives, which you can find at the Microsoft website. I can recommend the IronKey Workspace W300, W500, and W700 options in particular as I have hands on experience with those models, and they have additional security features like boot passwords and self destruction capabilities for hard core security buffs. However, you can use devices that are not officially certified and most likely they will work fine as long as they are USB 3 devices. In fact, one of the officially certified devices, the Kingston DataTraveler, is off of my recommended list because it became scorching hot in my tests after less than an hour of usage in a Windows to Go scenario.

Licensing Windows to Go

Of course, the technical brilliance of this solution is obscured by the money grab that is Microsoft licensing, although given recent events, businesses may have little choice but to pony up for the additional expense.

Windows to Go is part of the Software Assurance program, that bundle of additional benefits and license flexibility that you get by forking over about a 33-40% premium on top of the cost of the license in question. The benefits of SA differ depending on whether your license is for a consumer or a server operating system and also whether this is for server application software or business applications like Office.

For operating systems, Windows to Go is part of the Windows SA benefit package. But of course you also have to decide if you want to license per device or per user. If you are licensed per device with Windows SA, then you can use Windows to Go on any third party device while off site. If you license Windows SA on a per user basis, then you can use Windows to Go on any device. With both methods, you can also use Windows to Go on a personally owned device, but not while you are on a corporate campus. (This has to do with roaming benefits, or the ability to take a copy of the software you use at work and put it on your home machine.)

The Last Word

Windows to Go may have been ahead of its time, but it is certainly a competent solution for organizations that have more than a few regular international travelers getting caught up in these recent laptop bans. The great thing about using Windows to Go in these solutions is that it maintains your security profile, is only minimally more inconvenient for your traveler, and is easy to retire if and when these bans are ever lifted. Give it a look.