‘Big data’ is becoming a buzzword in all walks of business life, and legal is no exception – whether that involves supporting your business in its forays into big data or using big data to inform processes in the legal department itself. But what becomes increasingly clear on talking to proponents (or opponents) of this most cutting edge of technologies is that there seems to be no consensus about what it actually is.
Linda NiChualladh is communications, regulation, data protection, EU and competition counsel at state-owned Irish postal services company An Post, which has worked extensively on big data, data warehousing and upskilling staff accordingly. She has thought about this a lot. ‘At one stage, I had 23 definitions of what big data actually means,’ she says. Broadly speaking, it encompasses the accumulation, storage and use of large amounts of data from multiple sources in order to extrapolate trends, make predictions or connections, and generally reveal something that only a comparison of datasets can uncover.
Personal data
Big data is literally big. ‘I’d never heard of the word “petabyte” before I started this project,’ says Nick Maltby, general counsel and company secretary at Genomics England, the company behind the UK’s 100,000 Genomes Project. (A petabyte equals 1,000 terabytes of data, if you’re curious.) ‘I went onto Wikipedia and there were pictures of live data centres in the Midwest that would hold 20 petabytes and they were the size of a football pitch.’ The 100,000 Genomes Project has set out to catalogue the genomes of 70,000 NHS patients with rare diseases and cancers, plus those of their families. ‘Your genome is about 120 gigabytes,’ explains Maltby, adding that for each patient with a rare disease, the project would also require the genomes of both parents. Working with this amount of linked, personal data is unprecedented in the UK, he says, and the company has its own private cloud. So as well as the potential clinical and pharmaceutical benefits of such a project – which it is hoped will yield advancements in diagnostics, tailored drugs, and even reductions in drug trial times – it has scope for profound technological innovation.
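For a rough sense of that scale, the quoted figures can simply be multiplied out. The sketch below is a back-of-envelope illustration only: it assumes every one of the 100,000 genomes takes up Maltby's approximate 120 gigabytes, uses decimal units (1 petabyte = 1,000 terabytes = 1,000,000 gigabytes), and ignores compression, backups and any linked clinical records. The names in the code are ours, not the project's.

```python
# Back-of-envelope storage estimate using the figures quoted in the article.
GENOMES = 100_000        # target number of genomes in the project
GB_PER_GENOME = 120      # Maltby's approximate size of one genome
GB_PER_PB = 1_000_000    # decimal units: 1 PB = 1,000 TB = 1,000,000 GB


def total_petabytes(genomes: int, gb_per_genome: float) -> float:
    """Return the raw storage needed, in petabytes."""
    return genomes * gb_per_genome / GB_PER_PB


estimate_pb = total_petabytes(GENOMES, GB_PER_GENOME)
print(f"Roughly {estimate_pb:.0f} petabytes of raw genome data")
```

On those assumptions, the raw sequence data alone comes to about 12 petabytes, which is more than half the capacity of the football pitch-sized data centres Maltby describes.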
As GC, Maltby is grappling with procurement and contracting, along with the data and IP elements of such an undertaking. Crucially, the data will be commercialised (as pharma and biotech companies are invited to participate in research opportunities), which Maltby has to balance with the need for data protection compliance. ‘It’s sensitive personal data – it’s very difficult to anonymise it,’ he says, especially as a portion of the information includes images. One solution for locking down the data has been to create a reading library data centre rather than a lending library, and to impose limits on the tools that researchers can bring in. But the project must walk the line of facilitating – and not blocking – research while remaining legally compliant. Current legislative frameworks often don’t map onto these sorts of projects, which were undreamt of at the time that much European law was drafted, and so Maltby has spent a lot of time developing a solid information governance framework, along with obtaining individual and full consent from participants. Luckily, the project enjoys support from the UK public and the press. ‘We’ve had multiple moments where families have finally got a diagnosis, which is what the programme is about,’ he says.
The sound of data
The public often differentiates between ‘good’ and ‘bad’ big data, and the consequences of a breakdown of trust can be problematic for businesses faced with the increased savviness of ordinary people about their data rights. This can be seen from the recent takedown (and subsequent revision) of the ‘Safe Harbour’ agreement, which previously allowed wholesale transfer of data from Europe to the US, following action by an Austrian student. But companies storing big data must also be careful to maintain the goodwill of their commercial clients. This is uppermost in the mind of Åse Lundh Gravenius, general counsel of ICE Services in Sweden – a company that couldn’t have existed in a pre-big data era. ICE runs a database containing information describing 20 million musical works, and shares data about the usage of those works (for example, by radio stations or music streaming services) with musical collecting societies, thus ensuring that royalties go to the right place. The company has the technology to generate more information (and business opportunities) with the data it holds, but Lundh Gravenius is careful not to go outside of the strict contractual provisions agreed with clients. To overstep the mark into data exploitation would be ‘goodwill suicide,’ she says. ‘One of the common mistakes that businesses can make is to be greedy, and to ask clients to hand over too much control.’
Big data ethics go beyond the legality of what one can and can’t do with data, says Lundh Gravenius, and a general counsel must think beyond the law in order to properly protect the organisation. ‘You also need to be mindful of the business and technology environment in which you work. You can’t just stay in your small legal box,’ she explains. She emphasises the importance of dealing with both legal and non-legal issues from the very beginning, rather than trying to manage customer expectations further down the line. Of course, it’s not the in-house counsel’s role to deal directly with external clients, but rather to make sure the rest of the business is prepped to stay compliant. ‘You want your working staff to have some sort of legal backbone, so that they know when things get a bit tricky,’ she advises. ‘Actually training colleagues so that they know when they’re entering potentially dangerous territory is the best investment you can make as in-house counsel in a tech company working with big data.’
Lundh Gravenius believes that lawyers are needed like never before in the big data field, thanks to the IP, data protection, competition and contractual issues that could be present, alongside the nebulous areas where legislation falls short. But with all these factors at play, the GC has to remain adaptable, she stresses. ‘The tech guys can do everything! And it’s so easy to be a nay-sayer when you work with big data. But I don’t think you can be that kind of person, because so often it’s great ideas that you have to foster and nurture and find ways of making them work – without going beyond what you’re allowed to do as a lawyer.’ At An Post, NiChualladh concurs that nothing is risk-free, especially when working with the general public. She advocates a damage programme or a customer response team as a safety net, as well as engaging early on with consumer and advocacy groups to get the right message across ahead of time.
Precision viewing
Not all companies envisage a negative customer response to using data, and at HD PLUS in Germany, GC Michael Zeck is upbeat about the service benefits that customers could receive, which could include tailor-made video-on-demand and TV packages, able to predict viewing preferences with minute-by-minute precision. His organisation is not currently operating in the big data field, but it is planning future products, meaning that he is beginning to consider some of the legal implications. What he foresees are regulatory hurdles, particularly the Revised Data Protection Framework expected in Europe in 2018. He anticipates that lobbying could become a big part of his workload, to counter his concern that regulation could stifle innovation at European companies, enabling big players from the US to swoop in instead. Yet Zeck also recognises the importance of maintaining consumer trust through transparency when it comes to mitigating potential reputational risk. ‘It doesn’t make any sense to be hidden, and then it comes out through journalists,’ he says.
Busting the botnets
The work that André Jahel is doing in the big data field is a little more covert, however. The French lawyer is a counsel at Steering Law Firm, and also a consultant for the digital crimes unit (DCU) of Microsoft. He works with a team analysing the activities of ‘botnets’, or networks of ‘infected’ computers that (unknown to their owners) have been set up to forward viruses or spam, or steal financial information. The team has been able to take action through global law enforcement agencies to halt the spread of some malware by seizing servers used by cybercriminals and redirecting the traffic to Microsoft-maintained servers (‘sinkholes’). The team will then work with third parties such as computer emergency response teams and internet service providers to locate the victims. ‘Big data enables us to get informed about the malware, push our investigations further, notify our customers about their compromised IP addresses and help them clean their devices,’ Jahel explains. This means an enhanced understanding of the operating patterns of cybercriminals, and feeds into better-informed digital risk assessments for customers. For example, team members used data visualisation software to create a map of the whereabouts of infected machines. In one case, they noticed that only Western European computers were affected while those based in Eastern Europe were clear, and were able to conclude that the Ukraine- and Russia-based cybercriminals behind the attacks were consciously avoiding their own countries, to evade legal action. Here, big data enabled the DCU to team up with law enforcement agencies in over 80 countries to take down a botnet responsible for stealing over $500 million. Another of the team’s big data applications assists in finding photos of child abuse victims by matching online photos against a database containing millions of identified photos. Such images were previously untraceable because they might have been altered, or the parameters changed to avoid detection.
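Why do altered images defeat ordinary matching, and how might a database still catch them? The article does not say which technique the DCU's tool uses, so what follows is only an illustrative sketch of one widely used family of approaches, perceptual hashing, in its simplest ‘average hash’ form. Instead of comparing files byte for byte, each image is reduced to a short fingerprint that survives small changes such as brightening. The tiny pixel grids, function names and matching threshold below are all hypothetical.

```python
from typing import List

Image = List[List[int]]  # tiny greyscale image: rows of 0-255 pixel values


def average_hash(image: Image) -> List[int]:
    """One bit per pixel: 1 if the pixel is brighter than the image's mean."""
    pixels = [p for row in image for p in row]
    mean = sum(pixels) / len(pixels)
    return [1 if p > mean else 0 for p in pixels]


def hamming_distance(a: List[int], b: List[int]) -> int:
    """Number of positions at which two equal-length fingerprints differ."""
    return sum(x != y for x, y in zip(a, b))


def is_probable_match(a: Image, b: Image, max_distance: int = 2) -> bool:
    """Treat two images as the same picture if their fingerprints nearly agree."""
    return hamming_distance(average_hash(a), average_hash(b)) <= max_distance


# Hypothetical 4x4 test images, invented purely for illustration.
original = [[10, 20, 200, 210],
            [15, 25, 205, 215],
            [10, 20, 200, 210],
            [15, 25, 205, 215]]
brightened = [[p + 30 for p in row] for row in original]  # an "altered" copy
unrelated = [[200, 10, 200, 10],
             [10, 200, 10, 200],
             [200, 10, 200, 10],
             [10, 200, 10, 200]]

print(original == brightened)                   # exact comparison fails
print(is_probable_match(original, brightened))  # fingerprint still matches
print(is_probable_match(original, unrelated))   # a different picture does not
```

The brightened copy differs from the original in every pixel, so an exact comparison misses it, yet its fingerprint is unchanged because each pixel keeps its position relative to the image's average brightness. Production systems rely on far more robust fingerprints than this toy version.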
Jahel is clear that here, big data is being used for a civic purpose. ‘We are looking to provide a safe digital experience in the cloud. So, in addition to the security features that are being implemented in Microsoft products, we take it a step further, by combining big data analytics and legal strategies to help protect end users.’ Of course, this doesn’t mean that the endeavour is risk-free and, to mitigate privacy concerns, no action is taken without receiving an explicit request from the device owner. Also, customers only receive information related to their own network. Compliance guidelines are set by Microsoft at a global level, all data is anonymised, and there is no information-sharing with any third parties.
Top tips for working with big data:
- When you’re in big data, you can’t be isolated from the business. All internal stakeholders must be communicating and aligned to establish a good legal position early on, and build in data protection and privacy by design.
- Take time to properly understand the business environment. Small words in contracts can govern complex technological concepts, meaning that familiarity with technology terminology is essential.
- Never underestimate the fear that big data can evoke among customers and clients. Many legal actions are ‘crusade’ cases, based around principles rather than concrete harm. Linda NiChualladh’s top tip in all matters data comes back to a few words she read several years ago: Remember, ‘data is bits of people’.
Big risk
Work such as that done by Jahel and Maltby demonstrates that there is more to the relationship between in-house lawyers and big data than supporting the data-related commercial activities of the business. But there is also a move towards applying big data solutions to common in-house challenges, like improving efficiency and adding value. For example, an insurance company might benefit from an analysis of trends in litigation to indicate deficiencies in compliance programmes that might be generating large numbers of product failure, or health and safety-related compensation claims. Or an in-house team might want to predict more accurately where to allocate budget for legal spend. Linda NiChualladh has been looking at how companies use big data to market to customers, and includes in-house lawyers within this frame of reference. ‘I see us as almost having to sell our services internally. You want the ability to build up your own expertise in-house, and you want to be able to identify trends so that there’s a reason to have an in-house department,’ she explains. But, for her, the potential usefulness of big data in-house comes with major caveats. Not least of those is the need to avoid a rose-tinted view of what big data actually is. NiChualladh is quick to strip away the glamour: ‘I hear people saying, “we’re going into the cloud!” And I think, do you know that the cloud is an industrial estate near the airport? The entrance to the estate may be a couple of Portacabins with security people and a flickering television. That is the fluffy white cloud.’
She has worked with Kenneth McKenzie, strategic planning director at Irish advertising agency Target McConnells, who has explored the issue from the perspective of marketing professionals, and the pair have concluded that often businesses are ‘sleepwalking’ into using big data. Many are in danger of becoming overexcited about the potential benefits without first having the skills to use the technology properly. And NiChualladh highlights a lack of training as a major risk if in-house departments turn to big data en masse. A self-confessed ‘data nerd’, she takes a wry view of those who can ‘barely open an Excel spreadsheet’ enthusiastically embracing data-driven technology such as artificial intelligence. What is needed, she says, is a thorough grasp of the nuts and bolts of accurate data collection and proper statistical analysis. Otherwise, you face a whole gamut of strategic risks, on top of the data protection implications of inept handling of large amounts of personal data: ‘Are in-house legal teams using statisticians and behavioural psychologists to predict behaviour, or are they doing it themselves? I don’t want to have to go back in three years’ time and say my entire strategy for legal services is inappropriate because we misunderstood the data in the first place, or the data we got wasn’t accurate, or the assessments we made on it just didn’t come through.’
As well as ensuring a solid legal footing for all data endeavours, NiChualladh advocates taking a step back, questioning whether you even need the data, and having a clear sense of what problem you are trying to solve before diving in. And if you still want to go ahead, she says, you need the expertise – which means knowing and admitting what you don’t know. She foresees a future where legal departments become interdisciplinary environments, working hand in hand with other professional disciplines. Either that, she says, or you must upskill yourself. ‘Lawyers will no longer be able to be mono-disciplined. It won’t take long before you’ll see lawyers who are comfortable modelling with econometrics, working with game theory, and looking at datasets, working within in-house functions.’
Of course, much of this remains in the future for businesses and their in-house lawyers. Kenneth McKenzie argues that even in the world of advertising and marketing it’s early days for big data. ‘We’re told that big data is moulding the day-to-day of marketers, but I think a lot of that is prediction rather than reality. Conversations are happening that make people feel like they need to catch up. There’s a general atmosphere being generated rather than a reality being formed,’ he says. NiChualladh agrees that this is generally the case in the in-house world too, although she is beginning to hear more buzz from law firms. For now, it seems it’s a case of watch this space. But as technology advances, data will be generated from the most unexpected places – and businesses will need their lawyers appropriately skilled and ready to keep them on the straight and narrow.