What is Big Data?
What is Data?
The quantities, characters, or symbols on which operations are performed by a computer, which can be stored and transmitted in the form of electrical signals and recorded on magnetic, optical, or mechanical recording media.
What is big data?
There are many definitions of the term ‘big data’, but most suggest something like the following:
‘Extremely large collections of information (data sets) that can be analysed to reveal patterns, trends, and associations, especially relating to human behaviour and interactions.’
Getting programs on multiple machines to work together efficiently, so that each program knows which parts of the data to process and the results from all the machines can then be combined to make sense of a large pool of data, takes special programming techniques. Because it is typically much faster for programs to access data stored locally than over a network, the distribution of data across a cluster and the way the machines are networked together are also important considerations when thinking about big data problems.
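The split-process-combine idea can be sketched in a few lines of Python. This is a minimal, single-machine illustration that assumes a simple word-count task: the function names `split_into_partitions`, `count_words` and `merge_counts` are hypothetical, and the thread pool only stands in for the separate machines of a real cluster.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_into_partitions(records, n_workers):
    """Give each worker its own slice of the data (round-robin)."""
    return [records[i::n_workers] for i in range(n_workers)]


def count_words(partition):
    """Work done by one worker, using only its own partition."""
    counts = Counter()
    for line in partition:
        counts.update(line.lower().split())
    return counts


def merge_counts(partial_counts):
    """Combine the per-worker results into one overall result."""
    total = Counter()
    for counts in partial_counts:
        total.update(counts)
    return total


def distributed_word_count(records, n_workers=3):
    partitions = split_into_partitions(records, n_workers)
    # Assumption: threads stand in for machines in a cluster.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partial_counts = list(pool.map(count_words, partitions))
    return merge_counts(partial_counts)


records = [
    "big data needs special techniques",
    "data stored locally is faster to access",
    "big clusters split the data",
]
totals = distributed_word_count(records)
print(totals.most_common(2))
```

Each worker only ever touches its own partition, which mirrors the point about local access: the only data that has to travel between workers is the small set of partial counts, not the raw records.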
Sources of big data
The main sources of big data can be grouped under the headings of social (human), machine (sensor) and transactional.
Social (human) – this source is becoming more and more relevant to organizations. It includes social media posts, posted videos and similar content.
Machine (sensor) – this data comes from what can be measured by the equipment in use.
Transactional – this comes from the transactions undertaken by the organization. This is perhaps the most traditional of the sources.
Characteristics of big data
The characteristics of big data, called the 5Vs, are:
Volume
Variety
Velocity
Veracity
Value
These characteristics are generally accepted as the essential qualities of big data.
Volume
The volume of big data held by large companies like Walmart (supermarkets), Apple and eBay is measured in multiple petabytes. A petabyte is 10^15 bytes. Taking a typical disc on a personal computer (PC) to hold a gigabyte (10^9 bytes), the big data depositories of those companies hold at least the data that could typically be held on 1 million PCs, perhaps even 10 to 20 million PCs.
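The arithmetic behind that comparison can be checked directly. The sketch below assumes decimal units (a petabyte as 10^15 bytes and a gigabyte as 10^9 bytes) and uses the gigabyte-per-PC figure from the text. The 1, 10 and 20 petabyte holdings are illustrative values, not reported figures, and `equivalent_pcs` is a hypothetical helper name.

```python
PETABYTE = 10**15          # bytes, decimal definition
GIGABYTE = 10**9           # bytes, decimal definition
BYTES_PER_PC = GIGABYTE    # per-PC disc size taken from the text


def equivalent_pcs(petabytes):
    """Number of PC discs needed to hold the given number of petabytes."""
    return petabytes * PETABYTE // BYTES_PER_PC


for holding in (1, 10, 20):
    print(f"{holding} PB is roughly {equivalent_pcs(holding):,} PCs")
```

A single petabyte already corresponds to 1 million such discs, so holdings of 10 to 20 petabytes give the 10 to 20 million PCs quoted.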
Variety
Some of the variety of information can be seen from the examples listed. In particular, the following types of information are held:
Browsing activities: sites, pages visited, membership of websites, downloads, searches
Financial transactions
Interests
Buying habits
Reaction to advertisements on the net or to advertising emails
Geographical information
Information about social and business contacts
Text
Numerical information
Graphical information (such as photographs)
Oral information (such as voice mails)
Technical information (such as jet engine vibration and temperature analysis)
Velocity
To be useful in decision-making and performance management, information must be supplied quickly. For instance, in the above store scenario, there would be little use in obtaining the price-comparison information and texting customers after they had left the shop. If facial recognition is going to be used by shops and hotels, it has to be more or less instant so that guests can be welcomed by name.
Veracity
Veracity means accuracy and truthfulness, and relates to the quality of the data. In the context of big data, for any analysis to supply useful findings for decision-making, the data collected must be true. To assess how true the data is, companies must consider not only how accurate or reliable a data set may be but also how trustworthy its source is. Companies must be able to trust the source of the data being collected, and be confident that the data is reliable and accurate, if they are to base important and often costly decisions on the findings of its analysis.
Value
The last V of big data is Value. There is little point in going to the trouble and expense of gathering and analysing the data if this does not ultimately add value to the company. It is important for companies to consider the potential of big data analytics and the value it could create if the data is gathered, analysed and used wisely.
An example of how British supermarket group Tesco used data analysis to add value:
Processing and analysing big data
The processing of big data is generally known as big data analytics and includes:
Data mining
Analysing data to spot patterns and establish relationships such as associations, sequences and correlations.
Predictive analytics
A kind of data mining which aims to predict future events. For instance, the likelihood of someone being persuaded to upgrade a flight.
Text analytics
Scanning text such as emails and word processing documents to extract useful information. It could simply be searching for key words that indicate an interest in a product or place.
Voice analytics
As above, but with audio.
Statistical analytics
Used to identify trends, correlations and changes in behaviour.
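Two of these techniques can be illustrated with a short Python sketch. It is a toy example under stated assumptions, not a production approach: the keyword list, the sample messages and baskets, and the function names `find_keyword_hits` and `count_item_pairs` are all hypothetical. The first function performs the simple key-word search described under text analytics. The second counts how often two items appear together, a very basic form of the association-finding described under data mining.

```python
import re
from collections import Counter
from itertools import combinations


def find_keyword_hits(documents, keywords):
    """Map document index to the sorted keywords found in it.

    Matching is case-insensitive and on whole words only.
    """
    wanted = {keyword.lower() for keyword in keywords}
    hits = {}
    for index, text in enumerate(documents):
        words = set(re.findall(r"[a-z']+", text.lower()))
        found = sorted(wanted & words)
        if found:
            hits[index] = found
    return hits


def count_item_pairs(baskets):
    """Count how many baskets contain each unordered pair of items."""
    pair_counts = Counter()
    for basket in baskets:
        for pair in combinations(sorted(set(basket)), 2):
            pair_counts[pair] += 1
    return pair_counts


# Hypothetical sample data, for illustration only.
messages = [
    "Could you send me prices for flights to Rome?",
    "Thanks for the invoice.",
    "Is there a hotel near the Rome office?",
]
hits = find_keyword_hits(messages, ["flights", "hotel", "rome"])

baskets = [
    ["bread", "butter", "milk"],
    ["bread", "butter"],
    ["milk", "tea"],
]
pairs = count_item_pairs(baskets)

print(hits)
print(pairs.most_common(1))
```

In this sample, the first and third messages are flagged as showing interest in travel to Rome, and bread and butter emerge as the pair most often bought together. Real data mining tools go further, for example by measuring how much more often a pair occurs than chance would predict.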
Big Data Pyramids
In 1989, Ackoff’s study gave birth to the DIKW pyramid, commonly known as the knowledge pyramid. With the emergence of big data, the pyramid has also become known as the big data pyramid.
Rowley explained the pyramid: ‘Typically information is defined in terms of data, knowledge in terms of information, and wisdom in terms of knowledge.’
Data
A range of data can be collected from various sources. This is data, and it is not particularly useful in this form.
Information
The raw data can be analysed to look for trends or patterns. For instance, it may appear that there is a link between the purchase of a particular product and a particular group of customers. This is information.
Knowledge
The information can be analysed further to see how the identified links are connected. Knowing the details of exactly what types of customers buy a particular product or favour particular product features is knowledge.
Wisdom
The knowledge gathered can be used to make informed business decisions.