Spark Developer Certification (HDPCD)

Sample Video

What's Inside

  • 3 hours 45 minutes of training videos covering all exam objectives (100% of the syllabus)
  • Unlimited Access

HDP Certified Developer (HDPCD) Spark Certification

$69.95 $29.95
  • (Limited Period Offer)
  • 100% Syllabus covered: All exam objectives
  • Accessible on PC, Mac, iPhone®, iPad®, and Android™ devices

Add to cart

Topic-wise content distribution

Chapter | Video Duration
1. Introduction to the Series – Spark, SparkCore and SparkSQL | 15 mins 7 s
2. Getting Started with Spark – A General Introduction to Spark | 13 mins 9 s
3. Programming using Spark and Eclipse | 23 mins 26 s
4. Resilient Distributed Datasets – RDD | 20 mins 16 s
5. Fly with RDDs | 30 mins 15 s
6. Key Value Stores/Pair RDDs | 37 mins 5 s
7. Loading and Saving Data using Spark | 25 mins 41 s
8. Accumulators and Broadcast Variables | 13 mins 40 s
9. SparkSQL-I – SparkSQL using native FileSystems | 18 mins 17 s
10. SparkSQL-II – SparkSQL using Hadoop | 27 mins 50 s

What is Spark Developer Certification (HDPCD)?

The HDPCD Spark Certification is a hands-on, performance-based certification for Apache Spark developers on the Hortonworks Data Platform. Apache Spark is a fast, in-memory data processing engine with expressive APIs that support data science, machine learning, streaming applications and iterative workloads. It is a highly sought-after technology, currently used by companies such as Samsung, TripAdvisor, Yahoo!, eBay and many others. HDPCD Spark certified developers have an edge over the rest of the field because examinees perform a set of tasks on a live installation provided by Hortonworks rather than simply answering questions. Memorizing concepts by rote does not get you through HDPCD; these are developers who can get real work done, and the industry views them accordingly.

Because Spark has an extremely wide application surface, Hortonworks recognizes that covering every possible task on a live cluster would be daunting, so the certification is limited to SparkCore and SparkSQL applications. Developers may write their applications in either Scala or Python. This Whizlabs course recommends Scala as the preferred language for analytics because of its concise, LINQ-like syntax. A minimal example in that spirit follows this paragraph.
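
To give a feel for the kind of code the course works with, here is a minimal sketch that exercises both SparkCore (an RDD word count) and SparkSQL (the same data queried through a DataFrame). It assumes Spark 2.x with the Scala API; the application name, input path and column names are illustrative only and are not taken from the exam.

    import org.apache.spark.sql.SparkSession

    object WordCountAndSql {
      def main(args: Array[String]): Unit = {
        // Local session for experimentation; on a cluster the master is supplied by spark-submit.
        val spark = SparkSession.builder()
          .appName("HDPCD-style sketch")
          .master("local[*]")
          .getOrCreate()
        import spark.implicits._

        // SparkCore: a classic word count expressed as RDD transformations.
        val counts = spark.sparkContext
          .textFile("hdfs:///data/sample.txt")   // hypothetical input path
          .flatMap(_.split("\\s+"))
          .map(word => (word, 1))
          .reduceByKey(_ + _)

        // SparkSQL: the same result exposed as a DataFrame and queried with SQL.
        val df = counts.toDF("word", "freq")
        df.createOrReplaceTempView("word_counts")
        spark.sql("SELECT word, freq FROM word_counts ORDER BY freq DESC LIMIT 10").show()

        spark.stop()
      }
    }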


Do we have multiple choice questions in the Spark Developer (HDPCD) Certification exam?

No, there are no MCQs; instead, a live, performance-based test is conducted to gauge how well candidates can apply the concepts. Usually 7-8 tasks are provided, of which a candidate must complete at least 6. The exam lasts 2 hours and costs USD 250 per attempt.


How to register for this certification?

Create an account at www.examslocal.com. Once registered and logged in, select "Schedule an Exam", then enter "Hortonworks" in the "Search Here" field to locate and select the Hortonworks Spark Developer (HDPCD:Spark:AP) certification.


How long is the HDPCD Spark Developer Certification valid?

The HDPCD Spark Certification is valid for a particular version of Spark. For example, if you appeared for the exam on Spark v2.2 (the current version), your certification holds good for as long as Spark v2.2 is in use.


Who should take this course?

The certification is open to everyone; aspirants who wish to make data science their career path should pursue it. Going by the usual trends, both analysts and developers appear for this certification.


What are the prerequisites for taking this Hadoop Certification Training?

A minimally qualified HDPCD Spark candidate should be familiar with the following concepts (a short Scala sketch illustrating item 2 follows this list):

  1. Apache Hadoop – HDFS, YARN, Apache Ambari (Hortonworks), Apache Hive and the Hive Query Language (HQL)
  2. Scala/Python – basic programming techniques in Scala or Python: variables, values and user-defined functions
  3. Knowledge of SQL for JDBC-compliant databases is beneficial, but not mandatory.
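
As a rough illustration of item 2, the sketch below shows Scala variables, values and a user-defined function. Nothing here is Spark-specific, and the names and numbers are invented for the example.

    object ScalaBasics {
      def main(args: Array[String]): Unit = {
        // A val is immutable; a var can be reassigned.
        val ratePerHour = 250.0
        var hoursStudied = 3
        hoursStudied += 1

        // A user-defined function with an explicit return type.
        def totalCost(hours: Int, rate: Double): Double = hours * rate

        // Functions can also be stored in values and passed around.
        val withDiscount: Double => Double = price => price * 0.9

        println(s"Total cost: ${withDiscount(totalCost(hoursStudied, ratePerHour))}")
      }
    }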

Will I get placement assistance?

At Whizlabs, we are committed to providing world-class training in various technologies. The course content and training materials are created by industry experts who have carefully analyzed market demand. We are confident that after completing our self-study training on Big Data and Hadoop, you will be able to work with this technology.


What if I have more queries?

If you have any queries related to this course, payments, etc., please feel free to write to us here. A member of our support staff will respond as soon as possible.


What will you learn in this Spark Developer (HDPCD) Certification Self-Study training Course?

Topic, description and video duration for each chapter:

1. Introduction to the Series – Spark, SparkCore and SparkSQL
a) What is Apache Spark and why is the world after it?
b) Topics covered in the tutorial: Spark architecture and functioning, Spark Core programming and SparkSQL programming in Scala
c) Discussing the Apache Spark stack
d) Spark users and HDPCD prerequisites
Video Duration: 15 mins 7 s

2. Getting Started with Spark – A General Introduction to Spark
a) Spark application categories
b) Basic terminology used in Spark programs
c) Installing Spark on a Windows system as a standalone installation
Video Duration: 13 mins 9 s

3. Programming using Spark and Eclipse
a) Downloading and installing Eclipse and the Scala IDE
b) Creating your first Spark Scala program
c) Exporting your project as a JAR and running it with spark-submit
Video Duration: 23 mins 26 s

4. Resilient Distributed Datasets – RDD
a) What are RDDs?
b) RDD operations – transformations
c) RDD operations – actions
d) Taking transformations and actions to Eclipse
Video Duration: 20 mins 16 s

5. Fly with RDDs
a) Advanced transformations
b) Advanced actions
c) Programming advanced RDD operations in Scala
Video Duration: 30 mins 15 s

6. Key Value Stores/Pair RDDs
a) What are pair RDDs?
b) Transformations with pair RDDs
c) Actions with pair RDDs
d) Partitioning data
e) Working on data on a per-partition basis
Video Duration: 37 mins 5 s

7. Loading and Saving Data using Spark
a) Interacting with file systems (NAS, AWS S3, HDFS, ext4, NTFS)
b) Interacting with databases – JDBC
c) Interacting with HiveContext
Video Duration: 25 mins 41 s

8. Accumulators and Broadcast Variables
a) Accumulators and their execution procedures
b) Broadcast variables and their execution procedures
c) Numeric RDD operations
Video Duration: 13 mins 40 s

9. SparkSQL-I – SparkSQL using native FileSystems
a) Discussing what Datasets and DataFrames are in Spark
b) Creating DataFrames from text-file data
c) Operations with DataFrames
d) A program involving SparkSQL DataFrames
Video Duration: 18 mins 17 s

10. SparkSQL-II – SparkSQL using Hadoop
a) Connecting your program to the Hadoop cluster
b) Creating DataFrames from Hive tables (a short sketch follows this outline)
c) Comparing DataFrame creation from Hive tables vs. text files
d) SparkSQL Hive operations
e) A SparkSQL Hive application
Video Duration: 27 mins 50 s
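
To round off chapters 9 and 10, here is a minimal sketch of creating a DataFrame from a Hive table and querying it through SparkSQL. It assumes Spark 2.x built with Hive support and a reachable Hive metastore; the database, table and column names are hypothetical.

    import org.apache.spark.sql.SparkSession

    object HiveTableDataFrame {
      def main(args: Array[String]): Unit = {
        // enableHiveSupport() requires a Spark build with Hive support and a configured metastore.
        val spark = SparkSession.builder()
          .appName("SparkSQL over Hive")
          .enableHiveSupport()
          .getOrCreate()

        // A DataFrame created directly from a Hive table (hypothetical database and table).
        val orders = spark.table("sales_db.orders")
        orders.printSchema()

        // The same table queried with HQL-style SQL through SparkSQL.
        spark.sql(
          "SELECT customer_id, SUM(amount) AS total FROM sales_db.orders GROUP BY customer_id"
        ).show()

        spark.stop()
      }
    }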