{"id":41925,"date":"2017-11-06T16:45:59","date_gmt":"2017-11-06T16:45:59","guid":{"rendered":"https:\/\/www.whizlabs.com\/?p=41925"},"modified":"2024-05-13T09:49:47","modified_gmt":"2024-05-13T04:19:47","slug":"best-apache-spark-books","status":"publish","type":"post","link":"https:\/\/www.whizlabs.com\/blog\/best-apache-spark-books\/","title":{"rendered":"10 Best Apache Spark Books"},"content":{"rendered":"<p style=\"text-align: justify;\">Apache Spark is an open-source big data framework from Apache with built-in modules related to SQL, streaming, graph processing, and machine learning. It was open-sourced in 2010, and its impact on big data and related technologies was quite evident from the start as it quickly garnered the attention of 250+ organizations with over 1000 contributors. With so many Apache Spark books available, it is hard to find the best books for self-learning purposes.<\/p>\n<p style=\"text-align: justify;\">So, should you learn it? The answer depends on your interest. If you are heavily invested in big data, then Apache Spark is a must-learn for you as it will give you the necessary tools to succeed in the field. Learning Apache Spark is not easy unless you start with an <a href=\"https:\/\/www.whizlabs.com\/spark-developer-certification\/\" target=\"_blank\" rel=\"noopener\">online Apache Spark course<\/a> or by reading the best Apache Spark books.<\/p>\n<h2 style=\"text-align: justify;\">Here is our list of the Best Apache Spark Books<\/h2>\n<h4 style=\"text-align: justify;\">1. Learning Spark: Lightning-Fast Big Data Analysis<\/h4>\n<p style=\"text-align: justify;\">If you already know Python and Scala, then Learning Spark by Holden Karau, Andy Konwinski, Patrick Wendell, and Matei Zaharia is all you need. It is one of the best Apache Spark books for beginners as it discusses Spark fundamentals and architecture. 
It also explains core concepts such as in-memory caching, the interactive shell, <a href=\"https:\/\/www.whizlabs.com\/blog\/spark-rdd\/\" target=\"_blank\" rel=\"noopener\">Spark RDD<\/a>, and distributed datasets.<\/p>\n<figure id=\"attachment_41935\" aria-describedby=\"caption-attachment-41935\" style=\"width: 228px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/www.whizlabs.com\/blog\/wp-content\/uploads\/2024\/05\/1-learing-spark.jpg\"><img decoding=\"async\" class=\"wp-image-41935 size-medium\" src=\"https:\/\/www.whizlabs.com\/blog\/wp-content\/uploads\/2024\/05\/1-learing-spark.jpg\" alt=\"Learning Spark\" width=\"228\" height=\"300\" \/><\/a><figcaption id=\"caption-attachment-41935\" class=\"wp-caption-text\">Learning Spark: https:\/\/covers.oreillystatic.com\/images\/0636920028512\/lrg.jpg<\/figcaption><\/figure>\n<p style=\"text-align: justify;\">The book also demonstrates the powerful built-in libraries such as MLlib, Spark Streaming, and Spark SQL. As this book aims to improve your practical knowledge, it also covers deploying batch, interactive, and streaming applications.<\/p>\n<p>More Details:\u00a0<a href=\"http:\/\/shop.oreilly.com\/product\/0636920028512.do\" target=\"_blank\" rel=\"noopener\">http:\/\/shop.oreilly.com\/product\/0636920028512.do<\/a><\/p>\n<h4 style=\"text-align: justify;\">2. High-Performance Spark: Best Practices for Scaling and Optimizing Apache Spark<\/h4>\n<p style=\"text-align: justify;\">Optimization and scaling are two critical aspects of big data projects. Without these, the application will not be ready for real-world use. That\u2019s why you need to read High-Performance Spark by Holden Karau and Rachel Warren. 
This is one of the best Apache Spark books on the best practices for optimizing and scaling Apache Spark applications.<\/p>\n<figure id=\"attachment_41936\" aria-describedby=\"caption-attachment-41936\" style=\"width: 228px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/www.whizlabs.com\/blog\/wp-content\/uploads\/2024\/05\/2-high-spark-performance.jpg\"><img decoding=\"async\" class=\"wp-image-41936 size-medium\" src=\"https:\/\/www.whizlabs.com\/blog\/wp-content\/uploads\/2024\/05\/2-high-spark-performance.jpg\" alt=\"High Performance Spark\" width=\"228\" height=\"300\" \/><\/a><figcaption id=\"caption-attachment-41936\" class=\"wp-caption-text\">High Performance Spark: https:\/\/covers.oreillystatic.com\/images\/0636920046967\/lrg.jpg<\/figcaption><\/figure>\n<p style=\"text-align: justify;\">The book is aimed at people who already have a working knowledge of Apache Spark. By using the book, any developer, data engineer, or system administrator can save hours of hard work and make their applications optimized and scalable.<\/p>\n<p>More Details:\u00a0<a href=\"http:\/\/shop.oreilly.com\/product\/0636920046967.do\" target=\"_blank\" rel=\"noopener\">http:\/\/shop.oreilly.com\/product\/0636920046967.do<\/a><\/p>\n<h4 style=\"text-align: justify;\">3. Mastering Apache Spark<\/h4>\n<p style=\"text-align: justify;\">Mastering Apache Spark is one of the best Apache Spark books that you should only read if you have a basic understanding of Apache Spark. The book covers various Spark techniques and principles. It covers integration with third-party tools such as Databricks, H2O, and Titan. The author, Mike Frampton, uses code examples to explain all the topics. 
<a href=\"https:\/\/www.whizlabs.com\/blog\/5-best-apache-spark-certification\/\" target=\"_blank\" rel=\"noopener\">Databricks certification<\/a> is among the best Apache Spark certifications; if you want to become a certified Big Data professional, it is a good choice.<\/p>\n<figure id=\"attachment_41937\" aria-describedby=\"caption-attachment-41937\" style=\"width: 244px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/www.whizlabs.com\/blog\/wp-content\/uploads\/2024\/05\/3-mastering-apache-spark.jpg\"><img decoding=\"async\" class=\"wp-image-41936 size-medium\" src=\"https:\/\/www.whizlabs.com\/blog\/wp-content\/uploads\/2024\/05\/3-mastering-apache-spark.jpg\" alt=\"Mastering Apache Spark\" width=\"228\" height=\"300\" \/><\/a><figcaption id=\"caption-attachment-41937\" class=\"wp-caption-text\">Mastering Apache Spark: https:\/\/www.packtpub.com\/big-data-and-business-intelligence\/mastering-apache-spark<\/figcaption><\/figure>\n<p style=\"text-align: justify;\">From this book, you will also learn to use new tools for storage and processing, evaluate graph storage, and use Spark in the cloud.<\/p>\n<p>More Details:\u00a0<a href=\"https:\/\/www.packtpub.com\/big-data-and-business-intelligence\/mastering-apache-spark\" target=\"_blank\" rel=\"noopener\">https:\/\/www.packtpub.com\/big-data-and-business-intelligence\/mastering-apache-spark<\/a><\/p>\n<h4 style=\"text-align: justify;\">4. Apache Spark in 24 Hours, Sams Teach Yourself<\/h4>\n<p style=\"text-align: justify;\">Learning a topic in depth can take a lot of time. However, the workplace is competitive and requires new skills to be learned as fast as possible. 
That\u2019s why the Sams Teach Yourself series, which teaches a skill or topic in 24 hours, is popular among professionals.<\/p>\n<figure id=\"attachment_41938\" aria-describedby=\"caption-attachment-41938\" style=\"width: 230px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/www.whizlabs.com\/blog\/wp-content\/uploads\/2024\/05\/4-Apache-Spark-24-hours.jpg\"><img decoding=\"async\" class=\"wp-image-41938 size-medium\" src=\"https:\/\/www.whizlabs.com\/blog\/wp-content\/uploads\/2024\/05\/4-Apache-Spark-24-hours.jpg\" alt=\"Apache Spark in 24 Hours\" width=\"230\" height=\"300\" \/><\/a><figcaption id=\"caption-attachment-41938\" class=\"wp-caption-text\">Apache Spark in 24 hours: https:\/\/books.google.co.in\/books?id=sNPvDAAAQBAJ&amp;printsec=frontcover&amp;source=gbs_ge_summary_r&amp;cad=0#v=onepage&amp;q&amp;f=false<\/figcaption><\/figure>\n<p style=\"text-align: justify;\">Among the best Apache Spark books, this one is for complete beginners as it covers everything from the installation process to Spark\u2019s architecture. It also covers other topics such as Spark programming, extensions, performance, and much more. So, if you want to get an idea of what Apache Spark is, this book is for you.<\/p>\n<p>More Details:\u00a0<a href=\"https:\/\/www.packtpub.com\/big-data-and-business-intelligence\/spark-cookbook\" target=\"_blank\" rel=\"noopener\">https:\/\/www.packtpub.com\/big-data-and-business-intelligence\/spark-cookbook<\/a><\/p>\n<h4 style=\"text-align: justify;\">5. Spark Cookbook<\/h4>\n<p style=\"text-align: justify;\">If you are into production-level work, you already know the importance of a cookbook. It can help you quickly finish small, mundane tasks that don\u2019t require much thinking. Spark Cookbook by Rishi Yadav has over 60 recipes on Spark and its related topics. 
This is one of the best Apache Spark books, covering recipes for different types of tasks such as configuring and installing Apache Spark, setting up development environments, building a recommendation engine using MLlib, and much more.<\/p>\n<figure id=\"attachment_41939\" aria-describedby=\"caption-attachment-41939\" style=\"width: 245px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/www.whizlabs.com\/blog\/wp-content\/uploads\/2024\/05\/5-spark-cookbook.jpg\"><img decoding=\"async\" class=\"wp-image-41939 size-medium\" src=\"https:\/\/www.whizlabs.com\/blog\/wp-content\/uploads\/2024\/05\/5-spark-cookbook.jpg\" alt=\"Spark Cookbook\" width=\"245\" height=\"300\" \/><\/a><figcaption id=\"caption-attachment-41939\" class=\"wp-caption-text\">Image Source: https:\/\/www.packtpub.com\/big-data-and-business-intelligence\/spark-cookbook<\/figcaption><\/figure>\n<p style=\"text-align: justify;\">Spark Cookbook is primarily aimed at working professionals, and if you want a handy cookbook at your side, this book is for you.<\/p>\n<p>More Details: <a href=\"https:\/\/www.packtpub.com\/big-data-and-business-intelligence\/spark-cookbook\" target=\"_blank\" rel=\"noopener\">https:\/\/www.packtpub.com\/big-data-and-business-intelligence\/spark-cookbook<\/a><\/p>\n<blockquote><p><em>Get <\/em><span style=\"font-family: arial, helvetica, sans-serif;\"><em>50% discount on <a href=\"https:\/\/www.whizlabs.com\/hdpca-certification\/\">HDPCA Course<\/a>: Use <span class=\"il\">coupon<\/span> code <\/em><strong>HADOOP50<\/strong><\/span><\/p><\/blockquote>\n<h4 style=\"text-align: justify;\">6. 
Apache Spark Graph Processing<\/h4>\n<p style=\"text-align: justify;\">Apache Spark Graph Processing by Rindra Ramamonjison is aimed at big data developers and data scientists who are interested in improving their graph processing skills while working with big data.<\/p>\n<figure id=\"attachment_41940\" aria-describedby=\"caption-attachment-41940\" style=\"width: 244px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/www.whizlabs.com\/blog\/wp-content\/uploads\/2024\/05\/6-apache-spark-graph-processing.jpg\"><img decoding=\"async\" class=\"wp-image-41940 size-medium\" src=\"https:\/\/www.whizlabs.com\/blog\/wp-content\/uploads\/2024\/05\/6-apache-spark-graph-processing.jpg\" alt=\"Apache Spark Graph Processing\" width=\"244\" height=\"300\" \/><\/a><figcaption id=\"caption-attachment-41940\" class=\"wp-caption-text\">Image Source: https:\/\/www.packtpub.com\/big-data-and-business-intelligence\/apache-spark-graph-processing<\/figcaption><\/figure>\n<p style=\"text-align: justify;\">The first few chapters of the book cover a basic understanding of how you can build, process, and analyze graphs. The author then quickly moves to more advanced topics in the later part of the book, such as implementing graph-parallel iterative algorithms, clustering graphs, and much more.<\/p>\n<p>More Details:\u00a0<a href=\"https:\/\/www.packtpub.com\/big-data-and-business-intelligence\/apache-spark-graph-processing\" target=\"_blank\" rel=\"noopener\">https:\/\/www.packtpub.com\/big-data-and-business-intelligence\/apache-spark-graph-processing<\/a><\/p>\n<h4 style=\"text-align: justify;\">7. Advanced Analytics with Spark: Patterns for Learning from Data at Scale<\/h4>\n<p style=\"text-align: justify;\">Advanced Analytics with Spark will familiarize you not only with the Spark programming model but also with its ecosystem, general approaches in data science, and much more. 
This book by Sandy Ryza, Uri Laserson, Sean Owen, and Josh Wills is aimed at data scientists and developers who are interested in learning advanced techniques for large-scale data analytics.<\/p>\n<figure id=\"attachment_41941\" aria-describedby=\"caption-attachment-41941\" style=\"width: 228px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/www.whizlabs.com\/blog\/wp-content\/uploads\/2024\/05\/7-advanced-analytics-with-spark.jpg\"><img decoding=\"async\" class=\"wp-image-41941 size-medium\" src=\"https:\/\/www.whizlabs.com\/blog\/wp-content\/uploads\/2024\/05\/7-advanced-analytics-with-spark.jpg\" alt=\"Advanced Analytics with Spark\" width=\"228\" height=\"300\" \/><\/a><figcaption id=\"caption-attachment-41941\" class=\"wp-caption-text\">Image Source: https:\/\/covers.oreillystatic.com\/images\/0636920035091\/lrg.jpg<\/figcaption><\/figure>\n<p style=\"text-align: justify;\">The book starts with a basic introduction to Spark\u2019s ecosystem to ensure that the learning curve is not too steep. The later chapters cover how you can apply different patterns using techniques such as collaborative filtering, clustering, classification, and anomaly detection. This book is especially useful for anyone working in fields such as security, genomics, and finance.<\/p>\n<p>More Details:\u00a0<a href=\"http:\/\/shop.oreilly.com\/product\/0636920035091.do\" target=\"_blank\" rel=\"noopener\">http:\/\/shop.oreilly.com\/product\/0636920035091.do<\/a><\/p>\n<h4 style=\"text-align: justify;\">8. Spark: The Definitive Guide: Big Data Processing Made Simple<\/h4>\n<p style=\"text-align: justify;\">I don\u2019t usually recommend books that are yet to reach the market, but this book deserves a mention. 
The book, \u201cSpark: The Definitive Guide,\u201d is written by Bill Chambers and Matei Zaharia and is published by O\u2019Reilly.<\/p>\n<figure id=\"attachment_41942\" aria-describedby=\"caption-attachment-41942\" style=\"width: 229px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/www.whizlabs.com\/blog\/wp-content\/uploads\/2024\/05\/8-spark-definitive-guide.jpg\"><img decoding=\"async\" class=\"wp-image-41942 size-medium\" src=\"https:\/\/www.whizlabs.com\/blog\/wp-content\/uploads\/2024\/05\/8-spark-definitive-guide.jpg\" alt=\"Spark: The Definitive Guide\" width=\"229\" height=\"300\" \/><\/a><figcaption id=\"caption-attachment-41942\" class=\"wp-caption-text\">Image Source: The Definitive Guide: http:\/\/shop.oreilly.com\/product\/0636920034957.do<\/figcaption><\/figure>\n<p style=\"text-align: justify;\">The initial impressions of the book are good. If you go through the topics covered in the book, you will see that it covers almost every aspect of Apache Spark. The book is primarily aimed at beginners.<\/p>\n<p>More Details:\u00a0<a href=\"http:\/\/shop.oreilly.com\/product\/0636920034957.do\" target=\"_blank\" rel=\"noopener\">http:\/\/shop.oreilly.com\/product\/0636920034957.do<\/a><\/p>\n<h4 style=\"text-align: justify;\">9. Spark GraphX in Action<\/h4>\n<p style=\"text-align: justify;\">Without visuals, it is next to impossible to convince anyone in the marketing field. GraphX is a graph processing API that runs on top of Spark and gives you the tools to build and analyze graphs. It is one of the most advanced and useful APIs for graph workloads. 
The book covers practical examples of machine learning and graph processing.<\/p>\n<figure id=\"attachment_41943\" aria-describedby=\"caption-attachment-41943\" style=\"width: 238px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/www.whizlabs.com\/blog\/wp-content\/uploads\/2024\/05\/9-spark-graphx-in-action.jpg\"><img decoding=\"async\" class=\"wp-image-41943 size-medium\" src=\"https:\/\/www.whizlabs.com\/blog\/wp-content\/uploads\/2024\/05\/9-spark-graphx-in-action.jpg\" alt=\"Spark GraphX in Action\" width=\"238\" height=\"300\" \/><\/a><figcaption id=\"caption-attachment-41943\" class=\"wp-caption-text\">Image Source: https:\/\/www.manning.com\/books\/spark-graphx-in-action<\/figcaption><\/figure>\n<p style=\"text-align: justify;\">As GraphX is a popular library, it is covered in almost all the books mentioned in this article. However, none of them covers it in depth. So, if you are looking to improve your knowledge of GraphX or graph processing in general, give this book a read, and you will not be disappointed.<\/p>\n<p>More Details:\u00a0<a href=\"https:\/\/www.manning.com\/books\/spark-graphx-in-action\" target=\"_blank\" rel=\"noopener\">https:\/\/www.manning.com\/books\/spark-graphx-in-action<\/a><\/p>\n<h4 style=\"text-align: justify;\">10. Big Data Analytics with Spark<\/h4>\n<p style=\"text-align: justify;\">Big Data Analytics with Spark is yet another one of the best Apache Spark books aimed at beginners. It starts off gently and then focuses on useful topics such as Spark Streaming and Spark SQL. 
This book is an excellent choice for anyone who wants a high-level view of Spark\u2019s ecosystem.<\/p>\n<figure id=\"attachment_41945\" aria-describedby=\"caption-attachment-41945\" style=\"width: 206px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/www.whizlabs.com\/blog\/wp-content\/uploads\/2024\/05\/10-big-analytics-with-spark.jpg\"><img decoding=\"async\" class=\"wp-image-41945 size-medium\" src=\"https:\/\/www.whizlabs.com\/blog\/wp-content\/uploads\/2024\/05\/10-big-analytics-with-spark.jpg\" alt=\"Big Data Analytics with Spark\" width=\"206\" height=\"300\" \/><\/a><figcaption id=\"caption-attachment-41945\" class=\"wp-caption-text\">Image Source: http:\/\/www.apress.com\/us\/book\/9781484209653<\/figcaption><\/figure>\n<p>More Details:\u00a0<a href=\"http:\/\/www.apress.com\/us\/book\/9781484209653\" target=\"_blank\" rel=\"noopener\">http:\/\/www.apress.com\/us\/book\/9781484209653<\/a><\/p>\n<p style=\"text-align: justify;\"><strong><em>Whizlabs Big Data Certification courses &#8211; <a href=\"https:\/\/www.whizlabs.com\/spark-developer-certification\/\" target=\"_blank\" rel=\"noopener\">Spark Developer Certification (HDPCD)<\/a> and <a href=\"https:\/\/www.whizlabs.com\/hdpca-certification\/\" target=\"_blank\" rel=\"noopener\">HDP Certified Administrator (HDPCA)<\/a>\u00a0are based on the Hortonworks Data Platform, a market giant of Big Data platforms. Whizlabs recognizes that interacting with data and increasing its comprehensibility is the need of the hour and hence, we are proud to launch our <a href=\"https:\/\/www.whizlabs.com\/big-data-certifications\/\">Big Data Certifications<\/a>. We have created state-of-the-art content that should help data developers and administrators gain a competitive edge over others.<\/em><\/strong><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Apache Spark is an open-source big data framework from Apache with built-in modules related to SQL, streaming, graph processing, and machine learning. 
It was open-sourced in 2010, and its impact on big data and related technologies was quite evident from the start as it quickly garnered the attention of 250+ organizations with over 1000 contributors. With so many Apache Spark books available, it is hard to find the best books for self-learning purposes. So, should you learn it? The answer depends on your interest. If you are heavily invested in big data, then Apache Spark is a must-learn for you [&hellip;]<\/p>\n","protected":false},"author":220,"featured_media":42494,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_uag_custom_page_level_css":"","site-sidebar-layout":"default","site-content-layout":"","ast-site-content-layout":"default","site-content-style":"default","site-sidebar-style":"default","ast-global-header-display":"","ast-banner-title-visibility":"","ast-main-header-display":"","ast-hfb-above-header-display":"","ast-hfb-below-header-display":"","ast-hfb-mobile-header-display":"","site-post-title":"","ast-breadcrumbs-content":"","ast-featured-img":"","footer-sml-layout":"","theme-transparent-header-meta":"default","adv-header-id-meta":"","stick-header-meta":"default","header-above-stick-meta":"","header-main-stick-meta":"","header-below-stick-meta":"","astra-migrate-meta-layouts":"set","ast-page-background-enabled":"default","ast-page-background-meta":{"desktop":{"background-color":"var(--ast-global-color-4)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center 
center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"ast-content-background-meta":{"desktop":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center 
center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"footnotes":""},"categories":[6],"tags":[152,154,386,422,872,1029,1060,1475],"class_list":["post-41925","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-big-data","tag-apache-spark","tag-apache-spark-books","tag-best-apache-spark-books","tag-big-data","tag-hdpcd","tag-learning-apache-spark","tag-mastering-apache-spark","tag-spark-developer"],"uagb_featured_image_src":{"full":["https:\/\/www.whizlabs.com\/blog\/wp-content\/uploads\/2017\/11\/104.jpg",560,315,false],"thumbnail":["https:\/\/www.whizlabs.com\/blog\/wp-content\/uploads\/2017\/11\/104-150x150.jpg",150,150,true],"medium":["https:\/\/www.whizlabs.com\/blog\/wp-content\/uploads\/2017\/11\/104-300x169.jpg",300,169,true],"medium_large":["https:\/\/www.whizlabs.com\/blog\/wp-content\/uploads\/2017\/11\/104.jpg",560,315,false],"large":["https:\/\/www.whizlabs.com\/blog\/wp-content\/uploads\/2017\/11\/104.jpg",560,315,false],"1536x1536":["https:\/\/www.whizlabs.com\/blog\/wp-content\/uploads\/2017\/11\/104.jpg",560,315,false],"2048x2048":["https:\/\/www.whizlabs.com\/blog\/wp-content\/uploads\/2017\/11\/104.jpg",560,315,false],"profile_24":["https:\/\/www.whizlabs.com\/blog\/wp-content\/uploads\/2017\/11\/104.jpg",24,14,false],"profile_48":["https:\/\/www.whizlabs.com\/blog\/wp-content\/uploads\/2017\/11\/104.jpg",48,27,false],"profile_96":["https:\/\/www.whizlabs.com\/blog\/wp-content\/uploads\/2017\/11\/104.jpg",96,54,false],"profile_150":["https:\/\/www.whizlabs.com\/blog\/wp-content\/uploads\/2017\/11\/104.jpg",150,84,false],"profile_300":["https:\/\/www.whizlabs.com\/blog\/wp-content\/uploads\/2017\/11\/104.jpg",300,169,false],"tptn_thumbnail":["https:\/\/www.whizlabs.com\/blog\/wp-content\/uploads\/2017\/11\/104-250x250.jpg",250,250,true],"web-stories-poster-portrait":["https
:\/\/www.whizlabs.com\/blog\/wp-content\/uploads\/2017\/11\/104.jpg",560,315,false],"web-stories-publisher-logo":["https:\/\/www.whizlabs.com\/blog\/wp-content\/uploads\/2017\/11\/104.jpg",96,54,false],"web-stories-thumbnail":["https:\/\/www.whizlabs.com\/blog\/wp-content\/uploads\/2017\/11\/104.jpg",150,84,false]},"uagb_author_info":{"display_name":"Aditi Malhotra","author_link":"https:\/\/www.whizlabs.com\/blog\/author\/aditi\/"},"uagb_comment_info":5,"uagb_excerpt":"Apache Spark is an open-source big data framework from Apache with built-in modules related to SQL, streaming, graph processing, and machine learning. It was open-sourced in 2010, and its impact on big data and related technologies was quite evident from the start as it quickly garnered the attention of 250+ organizations with over 1000 contributors.&hellip;","_links":{"self":[{"href":"https:\/\/www.whizlabs.com\/blog\/wp-json\/wp\/v2\/posts\/41925","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.whizlabs.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.whizlabs.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.whizlabs.com\/blog\/wp-json\/wp\/v2\/users\/220"}],"replies":[{"embeddable":true,"href":"https:\/\/www.whizlabs.com\/blog\/wp-json\/wp\/v2\/comments?post=41925"}],"version-history":[{"count":5,"href":"https:\/\/www.whizlabs.com\/blog\/wp-json\/wp\/v2\/posts\/41925\/revisions"}],"predecessor-version":[{"id":95644,"href":"https:\/\/www.whizlabs.com\/blog\/wp-json\/wp\/v2\/posts\/41925\/revisions\/95644"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.whizlabs.com\/blog\/wp-json\/wp\/v2\/media\/42494"}],"wp:attachment":[{"href":"https:\/\/www.whizlabs.com\/blog\/wp-json\/wp\/v2\/media?parent=41925"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.whizlabs.com\/blog\/wp-json\/wp\/v2\/categories?post=41925"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.wh
izlabs.com\/blog\/wp-json\/wp\/v2\/tags?post=41925"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}