How much data do I need to have Big Data?

Luke Westby @lukewestby

About: Elm core contributor. Creator and former maintainer of ellie-app.com. elm-conf organizing team alumnus.

Joined: Dec 29, 2016


Publish Date: Aug 1 '19

Comments 4 total

  • Pierre Bouillon · Aug 1, 2019

    I may be mistaken, but I think you can say you are doing 'Big Data' whenever you have to process more data than your computer can handle.

    When I did some research on it, I was amused to see that, for example, even a huge CSV file to process could be considered 'Big Data' if you are working on a veeeery old computer 😄

    • Luke Westby · Aug 1, 2019

      Okay interesting, so big is relative to your ability to process it. Does that refer specifically to compute instances from cloud providers, or could it be, like, a laptop?

      • Pierre Bouillon · Aug 1, 2019

        Yes! I found it quite funny.

        And yes, from what I understand it depends on where you are doing the processing. So if you're using the cloud, then it's relative to the servers you're using.
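The "more data than your computer can handle" point above can be sketched with a streaming pass over a CSV: only one row sits in memory at a time, so the same loop works whether the file is a few kilobytes or far larger than RAM. The file name and column names below are made up for illustration.

```python
# A minimal sketch of processing a CSV too big to load at once:
# stream it row by row instead of reading the whole file into memory.
import csv
import os
import tempfile

# Write a small sample file standing in for a "huge" CSV.
path = os.path.join(tempfile.mkdtemp(), "measurements.csv")
with open(path, "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["id", "value"])
    for i in range(1000):
        writer.writerow([i, i * 2])

# Streaming pass: csv.DictReader yields one row at a time, so memory
# use stays constant no matter how long the file is.
total = 0
count = 0
with open(path, newline="") as f:
    for row in csv.DictReader(f):
        total += int(row["value"])
        count += 1

print(count, total)  # prints: 1000 999000
```

The same shape scales up: swap the `for row in ...` loop for a chunked or distributed reader and the aggregation logic stays the same.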

  • Hakki · Aug 1, 2019

    Big data can be described by the following characteristics:

    Volume
    The quantity of generated and stored data. The size of the data determines the value and potential insight, and whether it can be considered big data or not.

    Variety
    The type and nature of the data. This helps people who analyze it to effectively use the resulting insight. Big data draws from text, images, audio, video; plus it completes missing pieces through data fusion.

    Velocity
    The speed at which the data is generated and processed to meet the demands and challenges of growth and development. Big data is often available in real time. Compared to small data, big data is produced more continually. Two kinds of velocity relate to big data: the frequency of generation and the frequency of handling, recording, and publishing.

    Veracity
    An extended dimension of big data, referring to data quality and data value. The quality of captured data can vary greatly, which affects accurate analysis.

    Data must be processed with advanced tools (analytics and algorithms) to reveal meaningful information. For example, to manage a factory one must consider both visible and invisible issues with various components. Information generation algorithms must detect and address invisible issues such as machine degradation, component wear, etc. on the factory floor.

    Source: Wikipedia
