Six Vs of Big Data :-
- The ability to ingest, process and store very large datasets.
- The data can be generated by machines, networks, human interactions with systems, and so on.
- The data is measured in petabytes or even exabytes.
- Speed of data generation and frequency of delivery.
- The data flow is massive and continuous, which makes it valuable to researchers and businesses for decision making, strategic competitive advantage and ROI.
- To process high-velocity data, data processing tools known as streaming analytics were introduced.
- It refers to data from different sources and of different types, which may be structured or unstructured.
- Unstructured data creates problems for storage, data mining and analysis.
- As data grows, the number of data types has been growing fast as well.
- This refers to establishing whether the contextualizing structure of the data stream is regular and dependable even in conditions of extreme unpredictability.
- It defines the need to get meaningful data considering all possible circumstances.
- It refers to the biases, noise and abnormalities in data.
- This is where we need to identify the relevance of data and ensure data cleansing is done so that only valuable data is stored.
- Verify that the data is suitable for its intended purpose and usable within the analytic model.
- The data is to be tested against a set of defined criteria.
- Refers to the purpose, scenario or business outcome that the analytical solution has to address.
- Does the data have value? If not, is it worth storing or collecting?
- The analysis needs to be performed in a way that meets ethical considerations.
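
As a rough illustration of testing data against a set of defined criteria and cleansing it so that only usable records are stored, here is a minimal Python sketch. The record fields (`sensor_id`, `temperature`), the two criteria, the temperature range and the function names are all hypothetical, chosen only for this example. Real criteria would come from the analytic model the data is meant to feed.

```python
from typing import Callable

Record = dict

# Hypothetical criteria: each name maps to a check applied to one record.
CRITERIA: dict[str, Callable[[Record], bool]] = {
    # The record must carry a non-empty sensor identifier.
    "has_sensor_id": lambda r: bool(r.get("sensor_id")),
    # The temperature must be a number inside an assumed plausible range.
    "temperature_in_range": lambda r: (
        isinstance(r.get("temperature"), (int, float))
        and -50 <= r["temperature"] <= 60
    ),
}


def failed_criteria(record: Record) -> list[str]:
    """Return the names of all criteria the record does not meet."""
    return [name for name, check in CRITERIA.items() if not check(record)]


def cleanse(records: list[Record]) -> tuple[list[Record], list[tuple[Record, list[str]]]]:
    """Split records into valid ones and rejected ones with their failures."""
    valid, rejected = [], []
    for record in records:
        failures = failed_criteria(record)
        if failures:
            rejected.append((record, failures))
        else:
            valid.append(record)
    return valid, rejected


if __name__ == "__main__":
    sample = [
        {"sensor_id": "a1", "temperature": 21.5},
        {"sensor_id": "", "temperature": 20},
        {"sensor_id": "b2", "temperature": 999},
    ]
    valid, rejected = cleanse(sample)
    print("stored:", valid)
    for record, failures in rejected:
        print("rejected:", record, "failed:", failures)
```

Keeping the rejected records together with the names of the failed criteria, rather than silently dropping them, makes it easier to see whether the problem lies in the data or in the criteria themselves.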