Raffael Marty, the founder and CEO of PixlCloud, a next-generation data visualisation application for big data, is one of the most influential names in big data, analytics and visualisation.
Having been named in the top 200 thought leaders in big data and analytics by Analytics Week, Marty has also served as chief security strategist with Splunk and was a co-founder of Loggly, a cloud-based log management solution.
For more than 12 years Marty has helped Fortune 500 companies defend themselves against sophisticated adversaries and has trained organisations around the world in the art of data visualisation for security, making him a perfect choice to ask about information security, its past, its present and its future.
What were the paths that you took and the challenges you faced before arriving at where you are now?
Over the past decade I have been pursuing a quest to understand data better, mainly cyber security data. I realized that visualization is a very powerful method to quickly understand data. At some point, I became obsessed with security visualization. When I first started working on visualization and even in 2008 when I published my book (Applied Security Visualization), I ran into quite a lot of resistance and doubt from people.
Some of the resistance stemmed from the fact that a lot of the supporting technologies, things like big data stores and parallel processing, were not available just yet. Some of the resistance I used as a challenge to improve my tools and methods to visualize data. For example, I was obsessed with link graphs for a long time and was trying to map any problem into these graphs. I tried anything to scale graphs and even encode time into them (which turns out to be a hard problem). A lot of approaches that I tried helped me advance my tools, but some of them also failed. Overall, it pushed the envelope of my understanding and my work. At some point I realized that link graphs have their place, but are not the solution for everything.
This simple story bears a very important other lesson. Have you ever come across a company that has some amazing technology, but they don’t solve a real customer problem? I have seen too many of these companies lately. Not a good model to run a company!
How do you see the future of technology and security?
I am hardly a futurist, but it’s not hard to predict that computers will get faster and smaller. Memory is incredibly cheap already and it’s getting cheaper by the month. Processors seem to get faster by the day. This opens up a whole new world of possibilities and insights that we haven’t been able to dream of before. We can now ask really complicated questions of our data that years ago would have taken forever to answer.
This shift in data technology has an interesting impact on security also. We can collect much more data, and one day we might have the technology to make sense of all of that data. I am so bullish on gaining more insight into your data because it doesn’t just help find new attacks and breaches, but it has the side-effect that we learn more about our environments. The companies I work with on security issues get a lot of benefit from understanding their environments better.
In terms of information security, what is being overlooked now that people should be more aware of?
Pretty much every company I am talking to would like to know more about what is going on in their environment. They are starting to centralize and collect more and more data. Some are deploying log management tools, others are installing security information and event management (SIEM) capabilities. What they are not realizing is that these tools are not really going to magically help them better understand their infrastructure. These tools require a lot of knowledge that needs to be encoded in the tools so that they can find actionable insights.
Slowly we are getting better capabilities that allow analysts to find insights and learn more about their data, but this is still a very immature field.
What is security visualization and why is it very important?
Security visualization is about insights. Insights into your infrastructure, into your assets, your data, and eventually into your business. It does so by presenting the data, for example from log files, in a way that makes it easy for humans to understand and take action on. The field is still very young and we are still working on developing techniques and tools to make this process easier and more valuable for analysts.
What should one do if they want to get into security visualization?
The field of security visualization is a mix between information security and data visualization. Each field is extremely broad and has a large body of knowledge to understand. However, visualization starts with data. A good starting point is to look into data, log data in particular. Understand the data and understand how to process it so it can be analyzed more easily. Then, we need to look at visualization. Things like perception are very important to make effective visualizations. It might not be a surprise that there is a design or an artistic angle to security visualization as well. Unfortunately, we too often forget about that and create technically sound, but incredibly ‘ugly’ visualizations. So start with the data and work your way up.
There are a couple of books (Applied Security Visualization is one of them) on security visualization. These are a good starting point. Data Analysis and Visualization Linux (DAVIX) is a live CD that has a number of visualization tools readily installed to start experimenting with.
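As a small illustration of "starting with the data", the sketch below parses a few log lines into structured records and counts source-to-destination pairs, which is the kind of edge list a link graph could be drawn from. The log format, the field names, and the sample lines are all assumptions made up for this example; real log sources differ and usually need more careful parsing.

```python
from collections import Counter

# Hypothetical log lines in an assumed format:
# "<timestamp> <src_ip> <dst_ip> <dst_port> <action>"
SAMPLE_LOG = """\
2013-10-01T10:00:01 10.0.0.5 192.168.1.10 22 ALLOW
2013-10-01T10:00:02 10.0.0.5 192.168.1.10 22 ALLOW
2013-10-01T10:00:03 10.0.0.7 192.168.1.20 443 DENY
malformed line
"""

# Field names are illustrative, not taken from any particular product.
FIELDS = ("timestamp", "src", "dst", "port", "action")


def parse_line(line):
    """Turn one log line into a dict, or return None if it does not fit."""
    parts = line.split()
    if len(parts) != len(FIELDS):
        return None
    record = dict(zip(FIELDS, parts))
    try:
        record["port"] = int(record["port"])
    except ValueError:
        return None
    return record


def parse_log(text):
    """Parse every line, silently skipping the ones that cannot be parsed."""
    records = []
    for line in text.splitlines():
        record = parse_line(line)
        if record is not None:
            records.append(record)
    return records


def edge_counts(records):
    """Count how often each (source, destination) pair occurs."""
    return Counter((r["src"], r["dst"]) for r in records)


records = parse_log(SAMPLE_LOG)
edges = edge_counts(records)
for (src, dst), count in edges.most_common():
    print(src, "->", dst, count)
```

The resulting pairs and counts could be handed to any graph-drawing tool; the point of the sketch is only that the data has to be understood and structured before it is visualized.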
Could you please tell us more about the Information Visualization Process?
In order to generate meaningful visualizations that help you find insights, there is a very simple process that leads the analyst from the data to insights. The diagram shows the visualization process. At first look, the process looks very simple and obvious. However, I have seen many failures in visualization because of omitting one of the steps in the process. For example, it seems obvious that we need to define a goal or an objective. But how many times have we taken data and just thrown it into a visualization, only to be really frustrated with the output? We might even dismiss visualization altogether.
However, if we had a clear goal, we might have realized that we were missing some data to actually answer our initial question. It is perfectly okay to have ‘exploration’ as a goal, but by setting this goal, we make certain decisions on how we visualize the data. There is much more to say about the process, but I think I’ll leave it at this for now. (also see: http://www.networkworld.com/community/node/41856)
How is Security Visualization useful in production?
There are various areas in security organizations where we can benefit from visualization. Starting with reporting, which is not really very interesting. It’s a way to communicate data effectively though. Going on, we can build dashboards that display key metrics (hopefully actionable ones) for operations, security monitoring, or even executives.
However, these are sort of the ‘boring’ areas in visualization. What I personally focus on is highly interactive visual analytics. This is something that security intelligence systems are starting to employ to discover large scale trends or very subtle and hidden patterns. This is where big data and security intelligence really come into play (see http://www.slideshare.net/zrlram/visual-analytics-and-security-intelligence).
What are the common problems in security visualization?
Security visualization is still a very young field and there are a number of problems that I am seeing. I am giving the keynote at the VizSec conference in Seattle where I will be talking quite a bit about this issue. Let me mention just a couple of issues here:
The first one is processing large amounts of heterogeneous data. It is a hard problem to visualize a terabyte of data. Even if we had a way to do this, the other, almost bigger problem is the interpretation of this data. We have a hard time even understanding individual data records. Once we look at a million data records, this problem gets much harder. All of this is made worse by the fact that we don’t really have any tools that would allow us to visualize data for security purposes. Analysts and security engineers generally have to build their own tools to visualize their security data. Again, this is where we are doing a lot of work with PixlCloud.
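To make the scale problem concrete, one common way to keep a million records plottable is to aggregate them before drawing anything. The sketch below is only an illustration under stated assumptions and is not PixlCloud's method: the events are synthetic, each one is assumed to be an (epoch seconds, source address) pair, and the function name bucket_counts and the one-hour bucket size are hypothetical choices.

```python
from collections import Counter
from typing import Iterable, Iterator, Tuple

Event = Tuple[int, str]  # assumed shape: (epoch seconds, source address)


def bucket_counts(events: Iterable[Event], bucket_seconds: int = 3600) -> Counter:
    """Count events per (time bucket, source) while streaming over the input.

    Only the aggregated counts are kept in memory, never the raw events.
    """
    counts: Counter = Counter()
    for epoch, source in events:
        bucket_start = epoch - (epoch % bucket_seconds)
        counts[(bucket_start, source)] += 1
    return counts


def synthetic_events(n: int) -> Iterator[Event]:
    """Generate n made-up events, ten seconds apart, from four fake sources."""
    for i in range(n):
        yield (i * 10, "10.0.0.%d" % (i % 4))


counts = bucket_counts(synthetic_events(1_000_000))
print(len(counts), "aggregated cells instead of 1,000,000 raw records")
```

Aggregation like this makes the volume manageable for a heat map or time series, but it does not solve the interpretation problem described above, and it can hide exactly the subtle patterns an analyst is looking for.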
About the author: Jay Turla is a security researcher for the InfoSec Institute and one of the goons of ROOTCON (Philippine Hackers Conference).