Introduction To Big Data

“Big Data” and its technologies are the “new kid” on the block (at least for most of us!), and this post explains the fundamentals of Big Data.

Communication is at an all-time high today, and data is generated through cell phones, televisions, credit cards, airplanes, social media, trains and so on. Big Data has become a huge factor in the last three years; “data scientists” are in great demand, and most of them command mind-boggling salaries.

“Necessity is the mother of invention,” goes the saying, and the need to process, analyze and store huge amounts of data in the social media age has paved the way for Big Data and its technologies, such as the open source project Hadoop.

Issues With Traditional Systems:

We all remember the days of database systems, normalization and the different “normal forms”. These traditional systems falter under the current avalanche of data. The difficulty of setting up, maintaining and scaling legacy systems is the crucial reason why Big Data and its technologies are gaining importance today. Traditional systems also need a common data format: they cannot process pictures, emails, DBMS data and so on all in the same place.

How Much Data Is Being Generated?

According to a report from Webopedia, 2.5 quintillion bytes of data are generated every day. Another not-so-astonishing fact: most of the data in existence today was generated in the last two years alone. Among other statistics:

  • 51% of the data that is generated is structured
  • There are a billion social media posts every two days
  • 27% of the data that is generated is unstructured
  • 22% of the data that is generated is semi-structured
  • By 2015, 4.4 million jobs related to Big Data will be created
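
To put the daily figure in more familiar units, here is a rough back-of-the-envelope sketch. It assumes decimal (SI) units, where 1 GB is 10^9 bytes and 1 EB is 10^18 bytes, and it assumes the data arrives evenly through the day, which is a simplification. The function and variable names are illustrative only.

```python
# Back-of-the-envelope conversion of the daily data volume quoted above.
BYTES_PER_DAY = 2.5e18  # 2.5 quintillion bytes, the figure cited in this post

# Assumption: decimal (SI) units.
BYTES_PER_GB = 10**9
BYTES_PER_EB = 10**18
SECONDS_PER_DAY = 24 * 60 * 60


def bytes_to_exabytes(n_bytes: float) -> float:
    """Convert a byte count to exabytes (SI)."""
    return n_bytes / BYTES_PER_EB


def bytes_per_second(n_bytes_per_day: float) -> float:
    """Average rate, assuming the data arrives evenly over the day."""
    return n_bytes_per_day / SECONDS_PER_DAY


# Shares by data type, in percent, as listed in the statistics above.
shares = {"structured": 51, "unstructured": 27, "semi-structured": 22}

if __name__ == "__main__":
    print(f"{bytes_to_exabytes(BYTES_PER_DAY)} EB per day")
    gb_per_second = bytes_per_second(BYTES_PER_DAY) / BYTES_PER_GB
    print(f"about {gb_per_second:,.0f} GB per second")
    print(f"shares add up to {sum(shares.values())}%")
```

In other words, the quoted figure works out to 2.5 exabytes a day, and the three percentages in the list account for all of the data between them.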

What Is Structured Data, Unstructured Data And Semi-structured Data?

Typically, all data used to be “structured” a few years ago. With the passage of time and advancements in technology, unstructured data and semi-structured data crept in.

Data in an RDBMS (Relational Database Management System) or a spreadsheet is structured data. Structured data is anything that follows a common format and can be grouped into “chunks”.

Anything that cannot be put into rows and columns is “unstructured data”.

From the picture below, we can see that unstructured data does not follow any particular pattern and cannot be grouped into “chunks”. Examples vary and include videos, emails, wikis, PPTs, Word files and so on. (Semi-Structured Data)

Semi-structured data is anything between structured data and unstructured data. It is neither raw data (like videos and emails) nor data that is laid out neatly in columns and rows. It carries tags or other markers that identify its elements, as in XML or JSON documents.
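
The three categories can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample records, names and the `field_names` helper are all made up for this example, with CSV standing in for structured data, JSON for semi-structured data and a plain message for unstructured data.

```python
import csv
import io
import json

# Structured: fixed columns, and every row has the same fields
# (like an RDBMS table or a spreadsheet). Sample data is hypothetical.
structured_csv = "id,name,city\n1,Asha,Pune\n2,Ravi,Delhi\n"
rows = list(csv.DictReader(io.StringIO(structured_csv)))

# Semi-structured: keys (tags) identify each value, but records
# do not have to share the same set of fields.
semi_structured_json = (
    '[{"id": 1, "name": "Asha", "tags": ["vip"]},'
    ' {"id": 2, "email": "ravi@example.com"}]'
)
records = json.loads(semi_structured_json)

# Unstructured: free text with no fields, rows or columns to rely on.
unstructured_text = "Hi team, the shipment reached Pune late on Tuesday."


def field_names(items):
    """Return the sorted set of keys seen across all records."""
    return sorted({key for item in items for key in item})


if __name__ == "__main__":
    print(field_names(rows))     # the same columns for every row
    print(field_names(records))  # union of keys; individual records differ
    print(len(unstructured_text.split()), "words, but no named fields")
```

Every CSV row shares the same three columns, while the two JSON records carry different keys, and the message has no fields at all. That difference is what makes the last two harder to fit into a traditional rows-and-columns system.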

[Image: BigData1]

Big Data And Organizations:

Analyzing complex data and turning it into profits is one of the reasons why organizations embrace Big Data technologies. Companies such as GE and UPS are embracing Big Data technologies to optimize their businesses. “GE estimates that a 1% fuel reduction in the use of big data from aircraft engines would result in a $30 billion savings for the commercial airline industry over 15 years” (Big Data in Big Companies).

Big Data will work its magic across all segments of business, including retail and healthcare, in the coming years.


15 Important Big Data Facts for IT Professionals. (2014, Feb 4). Retrieved from

Big Data in Big Companies. (n.d.). Retrieved from International Institute for Analytics:

Semi-Structured Data. (n.d.). Retrieved from

About Aditi Malhotra

Aditi Malhotra is the Content Marketing Manager at Whizlabs. With a Master's in Journalism and Mass Communication, she helps businesses stop playing around with Content Marketing and start seeing tangible ROI. A writer by day and a reader by night, she is a fine blend of both reality and fantasy. Apart from her professional commitments, she is also endeavoring to publish a book of her own very soon.
