{"id":9988,"date":"2014-09-30T10:00:08","date_gmt":"2014-09-30T10:00:08","guid":{"rendered":"https:\/\/www.whizlabs.com\/?p=9988"},"modified":"2024-05-21T17:53:32","modified_gmt":"2024-05-21T12:23:32","slug":"introduction-to-big-data","status":"publish","type":"post","link":"https:\/\/www.whizlabs.com\/blog\/introduction-to-big-data\/","title":{"rendered":"Introduction To Big Data"},"content":{"rendered":"<p style=\"text-align: justify;\">\u201cBig Data\u201d and its technologies are the \u201cnew kid\u201d on the block (at least for most of us!), and this post explains the fundamentals of Big Data.<\/p>\n<p style=\"text-align: justify;\">Communication is at an all-time high today, and data is generated through cell phones, televisions, credit cards, airplanes, social media, trains and so on. Big Data has become a major factor in the last three years; \u201cdata scientists\u201d are in huge demand, and most of them command mind-boggling salaries.<\/p>\n<p style=\"text-align: justify;\">\u201cNecessity is the mother of invention\u201d, goes the saying, and the necessity to process, analyze and store huge amounts of data in the social media age has paved the way for \u2018Big Data\u2019 and technologies such as the open source project \u2018Hadoop\u2019.<\/p>\n<h4 style=\"text-align: justify;\"><b style=\"line-height: 1.5em;\">Issues With Traditional Systems:<br \/>\n<\/b><\/h4>\n<p style=\"text-align: justify;\">We all remember the days of database systems, normalization and the different \u2018normal forms\u2019. These traditional systems falter under the current avalanche of data. The effort of setting up legacy systems, maintaining them and scaling them is a crucial reason why Big Data and its technologies are gaining importance today. 
Traditional systems also need a common data format \u2013 they cannot process pictures, emails, DBMS data, etc. all in the same place.<\/p>\n<h4 style=\"text-align: justify;\"><b>How Much Data Is Being Generated?<\/b><\/h4>\n<p style=\"text-align: justify;\">According to a report from Webopedia, 2.5 quintillion bytes of data are being generated every day. Another not-so-astonishing fact \u2013 most of the data in existence has been generated in the last two years alone. Other statistics include:<\/p>\n<ul style=\"text-align: justify;\">\n<li>51% of the data generated is structured<\/li>\n<li>There are a billion social media posts every two days<\/li>\n<li>27% of the data generated is unstructured<\/li>\n<li>22% of the data generated is semi-structured<\/li>\n<li>By 2015, 4.4 million jobs related to Big Data will be created<\/li>\n<\/ul>\n<h4 style=\"text-align: justify;\"><b>What Is Structured Data, Unstructured Data And Semi-structured Data?<br \/>\n<\/b><\/h4>\n<p style=\"text-align: justify;\"><span style=\"line-height: 1.5em;\">Until a few years ago, data was typically \u201cstructured\u201d. With the passage of time and advancements in technology, unstructured data and semi-structured data crept in.<\/span><\/p>\n<h5 style=\"text-align: justify;\"><b style=\"line-height: 1.5em;\"><i>Data in an RDBMS (Relational Database Management System) or a spreadsheet is structured data. Structured data is anything that follows a common format and can be grouped into \u201cchunks\u201d.<\/i><\/b><\/h5>\n<p style=\"text-align: justify;\"><span style=\"line-height: 1.5em;\">Anything that cannot be put into rows and columns is \u201cunstructured data\u201d.<\/span><\/p>\n<p style=\"text-align: justify;\">As the picture below shows, unstructured data does not follow any particular pattern and cannot be grouped into \u201cchunks\u201d. 
Its forms vary; examples include videos, emails, wikis, PowerPoint presentations, Word files and so on.\u00a0(Semi-Structured Data)<\/p>\n<p style=\"text-align: justify;\">Semi-structured data falls between structured data and unstructured data. It is neither raw data (such as videos or emails) nor data that is laid out neatly in rows and columns. It has some structure, but instead of a fixed table layout it carries tags and other markers that identify its elements, as XML and JSON documents do.<\/p>\n<h4 style=\"text-align: justify;\"><span style=\"line-height: 1.5em;\"><img decoding=\"async\" class=\"aligncenter size-full wp-image-9990\" src=\"https:\/\/www.whizlabs.com\/blog\/wp-content\/uploads\/2024\/05\/BigData.jpg\" alt=\"BigData1\" width=\"790\" height=\"536\" \/><b>Big Data And Organizations:<br \/>\n<\/b><\/span><\/h4>\n<p style=\"text-align: justify;\">Analyzing complex data and turning it into profits is one of the reasons why organizations embrace Big Data technologies. Companies such as GE and UPS are embracing Big Data technologies to optimize their businesses. \u201cGE estimates that a 1% fuel reduction in the use of big data from aircraft engines would result in a $30 billion savings for the commercial airline industry over 15 years\u201d.\u00a0(Big Data in Big Companies)<\/p>\n<p style=\"text-align: justify;\">In the coming years, Big Data will extend its reach across business segments, including retail and healthcare.<\/p>\n<h4 style=\"text-align: justify;\">Bibliography<\/h4>\n<p style=\"text-align: justify;\"><i>15 Important Big Data Facts for IT Professionals<\/i>. (2014, Feb 4). Retrieved from Webopedia.com: http:\/\/www.webopedia.com\/quick_ref\/important-big-data-facts-for-it-professionals.html<\/p>\n<p style=\"text-align: justify;\"><i>Big Data in Big Companies<\/i>. (n.d.). Retrieved from International Institute for Analytics: http:\/\/www.sas.com\/content\/dam\/SAS\/en_us\/doc\/whitepaper2\/bigdata-bigcompanies-106461.pdf<\/p>\n<p style=\"text-align: justify;\"><i>Semi-Structured Data<\/i>. (n.d.). 
Retrieved from http:\/\/www.dcs.bbk.ac.uk\/~ptw\/teaching\/ssd\/notes.html<\/p>\n","protected":false},"excerpt":{"rendered":"<p>\u201cBig Data\u201d and its technologies are the \u201cnew kid\u201d on the block (at least for most of us!), and this post explains the fundamentals of Big Data. Communication is at an all-time high today, and data is generated through cell phones, televisions, credit cards, airplanes, social media, trains and so on. Big Data has become a major factor in the last three years; \u201cdata scientists\u201d are in huge demand, and most of them command mind-boggling salaries. \u201cNecessity is the mother of invention\u201d, goes the saying, and the necessity to process, analyze and store huge [&hellip;]<\/p>\n","protected":false},"author":220,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_uag_custom_page_level_css":"","site-sidebar-layout":"default","site-content-layout":"","ast-site-content-layout":"default","site-content-style":"default","site-sidebar-style":"default","ast-global-header-display":"","ast-banner-title-visibility":"","ast-main-header-display":"","ast-hfb-above-header-display":"","ast-hfb-below-header-display":"","ast-hfb-mobile-header-display":"","site-post-title":"","ast-breadcrumbs-content":"","ast-featured-img":"","footer-sml-layout":"","theme-transparent-header-meta":"default","adv-header-id-meta":"","stick-header-meta":"default","header-above-stick-meta":"","header-main-stick-meta":"","header-below-stick-meta":"","astra-migrate-meta-layouts":"set","ast-page-background-enabled":"default","ast-page-background-meta":{"desktop":{"background-color":"var(--ast-global-color-4)","background-image":"","background-repeat":"repeat","background-position":"center 
center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"ast-content-background-meta":{"desktop":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center 
center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"footnotes":""},"categories":[6],"tags":[422,830,1451,1499,1575],"class_list":["post-9988","post","type-post","status-publish","format-standard","hentry","category-big-data","tag-big-data","tag-hadoop","tag-semi-structured-data","tag-structured-data","tag-unstructured-data"],"uagb_featured_image_src":{"full":false,"thumbnail":false,"medium":false,"medium_large":false,"large":false,"1536x1536":false,"2048x2048":false,"profile_24":false,"profile_48":false,"profile_96":false,"profile_150":false,"profile_300":false,"tptn_thumbnail":false,"web-stories-poster-portrait":false,"web-stories-publisher-logo":false,"web-stories-thumbnail":false},"uagb_author_info":{"display_name":"Aditi Malhotra","author_link":"https:\/\/www.whizlabs.com\/blog\/author\/aditi\/"},"uagb_comment_info":52,"uagb_excerpt":"\u201cBig Data\u201d and its technologies are the \u201cnew kid\u201d on the block (at least for most of us!), and this post explains the fundamentals of Big Data. 
Communication is at an all-time high today, and data is generated through cell phones, televisions, credit cards, airplanes, social media, trains and so on.&hellip;","_links":{"self":[{"href":"https:\/\/www.whizlabs.com\/blog\/wp-json\/wp\/v2\/posts\/9988","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.whizlabs.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.whizlabs.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.whizlabs.com\/blog\/wp-json\/wp\/v2\/users\/220"}],"replies":[{"embeddable":true,"href":"https:\/\/www.whizlabs.com\/blog\/wp-json\/wp\/v2\/comments?post=9988"}],"version-history":[{"count":2,"href":"https:\/\/www.whizlabs.com\/blog\/wp-json\/wp\/v2\/posts\/9988\/revisions"}],"predecessor-version":[{"id":96340,"href":"https:\/\/www.whizlabs.com\/blog\/wp-json\/wp\/v2\/posts\/9988\/revisions\/96340"}],"wp:attachment":[{"href":"https:\/\/www.whizlabs.com\/blog\/wp-json\/wp\/v2\/media?parent=9988"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.whizlabs.com\/blog\/wp-json\/wp\/v2\/categories?post=9988"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.whizlabs.com\/blog\/wp-json\/wp\/v2\/tags?post=9988"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}