Dealing with Data: Big Data


Posted on 20 February 2017 | Alasdair Rutherford

This video asks "what is ‘big data’?" Big data has many definitions, but it is best understood through its key characteristics:


• volume – this is a relative concept: what counts as big for your organisation may differ from what Google, for example, considers big. The common link is that the size of the data makes it challenging to work with.
• variety – refers to the different types of information held in ‘big’ datasets. There may be a mix of quantitative and qualitative information, or data gathered from multiple sources and linked together.
• velocity – big data is often generated at a rapid pace, e.g. tweets or credit card transactions. Even an organisation’s administrative data can be updated continuously, e.g. HMRC PAYE records.
• veracity – big data can be messy in terms of its structure and quality.
• value – using big data should lead to identifiable value for the organisation; avoid getting caught up in the hype and adopting big data unthinkingly.

The different types of big data will be familiar to many and are often categorised as follows:
• Human-sourced information – e.g. social media content.
• Traditional business systems – e.g. administrative data generated by an organisation.
• Machine-generated data – e.g. information recorded by sensors in the home.

Alasdair also addresses some of the significant challenges in working with big data and specifically highlights the ethical issues that arise (e.g. privacy, confidentiality and anonymity).