Machine learning with Apache SystemML (Cognitive Class Certification)
Module 1 – What is SystemML?
Question 1: In machine learning, as analytical models are exposed to new data, they are able to independently adapt. True or false?
- True
- False
Question 2: Which of the following are alternatives to SystemML?
- R
- MLlib
- SparkR
- Mahout
- All of the above
Question 3: The R language was designed for machine learning and works great for big data. True or false?
- True
- False
Module 2 – SystemML and the Spark MLContext
Question 1: In which ways can you use SystemML’s Spark MLContext?
- spark-shell
- Through an application using the API
- Through the SystemML console
- A notebook interface
- None of the above
Question 2: You must pass in the reference of the SparkContext to the MLContext constructor. True or false?
- True
- False
Question 3: Why would you use the Spark MLContext?
- Programmatic interface into SystemML’s libraries
- To benefit from the optimizations that come with SystemML
- When you need to convert the data to a binary block matrix
- A and B only
- None of the above
Module 3 – SystemML algorithms
Question 1: The ensemble-learning classification algorithm creates a model composed of a set of tree models for classification. True or false?
- True
- False
Question 2: K-means is an unsupervised learning algorithm used to assign a category label to each record so that similar records tend to get the same label. True or false?
- True
- False
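For readers who want to see how k-means groups similar records, here is a minimal pure-Python sketch of Lloyd's algorithm. It is only an illustration, not SystemML's implementation: the function name `kmeans`, the naive "first k points" initialization, and the sample records are all assumptions made for this example.

```python
def kmeans(points, k, iterations=10):
    """Cluster points (tuples of numbers) into k groups with Lloyd's algorithm."""
    # Assumption: use the first k points as starting centroids (naive but deterministic).
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: label each point with the index of its nearest centroid.
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its assigned points.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels, centroids


# Hypothetical records: two well-separated groups, interleaved.
records = [(0, 0), (10, 10), (0, 1), (10, 11), (1, 0), (11, 10)]
labels, centroids = kmeans(records, k=2)
print(labels)  # [0, 1, 0, 1, 0, 1]
```

Records that sit close together end up with the same label, even though no labels were supplied up front.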
Question 3: The Kaplan-Meier algorithm predicts how likely it is that someone will purchase a product in a similar category. True or false?
- True
- False
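For context, Kaplan-Meier is usually described as an estimator of a survival function from time-to-event data. The sketch below is a minimal pure-Python version under that assumption; it is not SystemML's script, and the name `kaplan_meier` and the sample data are hypothetical.

```python
def kaplan_meier(durations, observed):
    """Return [(time, survival_probability)] at each time where an event occurred.

    durations: time until the event or until censoring, one value per subject.
    observed:  True if the event was seen, False if the subject was censored.
    """
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(set(durations)):
        events = sum(1 for d, o in zip(durations, observed) if d == t and o)
        leaving = sum(1 for d in durations if d == t)  # events plus censored at t
        if events:
            # Multiply by the fraction of at-risk subjects that survive time t.
            survival *= 1 - events / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical data: four subjects, the second one censored at time 2.
curve = kaplan_meier([1, 2, 2, 3], [True, False, True, True])
print([(t, round(s, 3)) for t, s in curve])  # [(1, 0.75), (2, 0.5), (3, 0.0)]
```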
Module 4 – Declarative Machine Learning (DML)
Question 1: What does DML stand for?
- Data manipulation language
- Data machine language
- Declarative machine learning
- Declarative machine language
Question 2: To run a DML script, which of the following JAR files is required at runtime?
- MLContext.jar
- DML.jar
- SystemML.jar
- spark-context.jar
Question 3: Which of the following ways of passing command-line arguments is recommended?
- positional arguments
- named arguments
- a comma separated list
- a file
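To make the difference between the two argument styles concrete, the sketch below builds the argument list for a DML run in each style. It assumes the `-f`, `-nvargs`, and `-args` flags as commonly shown in SystemML's documentation (check them against your version); the helper `build_command` and the file names are hypothetical.

```python
def build_command(script, named=None, positional=None):
    """Build an argument list for running a DML script via spark-submit."""
    command = ["spark-submit", "SystemML.jar", "-f", script]
    if named:
        # Named arguments: the script refers to them by name, e.g. $X.
        command.append("-nvargs")
        command.extend(f"{name}={value}" for name, value in named.items())
    elif positional:
        # Positional arguments: the script refers to them by position, e.g. $1.
        command.append("-args")
        command.extend(str(value) for value in positional)
    return command


named_cmd = build_command("train.dml", named={"X": "features.csv", "maxiter": 20})
positional_cmd = build_command("train.dml", positional=["features.csv", 20])
print(named_cmd)
# ['spark-submit', 'SystemML.jar', '-f', 'train.dml', '-nvargs', 'X=features.csv', 'maxiter=20']
```

With named arguments, the command line documents itself and the order of the values no longer matters.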
Module 5 – SystemML architecture and optimization
Question 1: In the ALS performance comparison, at which dataset size does the MLlib code run out of memory?
- Large
- Medium
- Small
- None
Question 2: Which of the following does NOT belong to the SystemML Optimizer stack?
- Create the RDDs for the high level algorithm
- Compute memory estimates
- Generate runtime program
- Live variable analysis
Question 3: How does SystemML know it is better to run the code on one machine?
- Advanced Rewrites
- Propagation of statistics
- Live variable analysis
- Efficient runtime
- The developer tells it to
Final Exam
Question 1: What is machine learning?
- Artificial intelligence for machines to make decisions
- Same as data science to gather insight using machines
- Enable computers to learn without being explicitly programmed
- Learning about how machines operate
Question 2: What is the purpose of SystemML?
- Programming language for big data
- In-memory analytics engine
- Machine learning for spark
- Machine learning on hadoop
- All of the above
Question 3: What are the challenges of machine learning on big data using R?
- Programmers are needed to convert the high level code to low level code for parallel computing
- Each iteration of the code takes time to be rewritten and recompiled
- Chances for errors are higher during the translation of the algorithms
- All of the above
Question 4: What is the vision of SystemML?
- Run the same algorithm developed for small data on big data
- Provide flexible specification of ML algorithms
- Automatic generation of hybrid runtime plans
- All of the above
Question 5: Which of the following languages is SystemML most similar to?
- R
- Python
- Java
- Scala
- Perl
- R and Python
- Java and Scala
Question 6: Which of the following lines of code will launch the Spark shell with SystemML?
- ./bin/spark-shell --jars SystemML.jar
- ./bin/spark-shell --executor-memory 4G --jars SystemML.jar
- ./bin/spark-shell --driver-memory 4G --jars SystemML.jar
- ./bin/spark-shell --executor-memory 4G --driver-memory 4G --jars SystemML.jar
- All of the above
Question 7: Why would you convert a DataFrame to a binary-block matrix?
- To enable parallelization within the Spark engine
- To use the rich set of APIs provided by the binary-block matrix
- Allows algorithm performance to be measured separately from data conversion time
- Allows a more efficient runtime processing of the data
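As background for this question, a binary-block matrix stores a large matrix as a collection of fixed-size blocks that can be processed independently. The sketch below illustrates the blocking idea only, on plain Python lists; the helper `to_blocks`, the tiny block size, and the sample matrix are assumptions, and SystemML's actual on-disk and in-memory format is more involved.

```python
def to_blocks(matrix, block_size):
    """Split a row-major matrix (list of lists) into fixed-size square blocks.

    Returns a dict keyed by (block_row, block_col); edge blocks may be smaller.
    """
    rows, cols = len(matrix), len(matrix[0])
    blocks = {}
    for i in range(0, rows, block_size):
        for j in range(0, cols, block_size):
            blocks[(i // block_size, j // block_size)] = [
                row[j:j + block_size] for row in matrix[i:i + block_size]
            ]
    return blocks


matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
blocks = to_blocks(matrix, block_size=2)
print(blocks[(0, 0)])  # [[1, 2], [4, 5]]
print(blocks[(1, 1)])  # [[9]]
```

Doing this conversion once, ahead of time, keeps its cost out of the timing of the algorithm that later consumes the blocks.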
Question 8: Which of the following is TRUE with regard to helper methods in SystemML?
- SystemML’s output is encapsulated in the MLContext object
- SystemML’s output is encapsulated in the MLOutput object
- Helper methods retrieve the values from the MLOutput object
- Helper methods retrieve the values from the MLContext object
- A and D only
- B and C only
Question 9: Which is NOT a benefit of using SystemML algorithms?
- Run in parallel
- It is faster than all other algorithms
- No need for translation into a lower level language
- Algorithms are optimized based on data and cluster characteristics
Question 10: Which of the following classes of algorithms provide a recommendation?
- Regression
- Classification
- Matrix Factorization
- Descriptive statistics
Question 11: Which of the following algorithms can group a set of data into known categories?
- Regression
- Clustering
- Survival Analysis
- Classification
Question 12: Which of the following algorithms can be used for prediction, forecasting, or error reduction?
- Clustering
- Regression
- Survival Analysis
- Descriptive statistics
Question 13: Which of the following value types is NOT supported in the DML language?
- String
- Double
- Varchar
- Boolean
Question 14: Matrix-vector operations avoid the need to create a replicated matrix for a certain subset of operations. True or false?
- True
- False
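To illustrate what "avoiding a replicated matrix" means, the sketch below centers the columns of a small matrix in two ways: once by applying the vector directly to each row, and once by first materializing a full matrix of repeated vector rows. It uses plain Python lists rather than DML, and both function names and the data are hypothetical.

```python
def subtract_row_vector(matrix, vector):
    """Subtract `vector` from every row without building a replicated matrix."""
    return [[x - v for x, v in zip(row, vector)] for row in matrix]


def subtract_replicated(matrix, vector):
    """Same result, but first materializes a matrix of repeated vector rows."""
    replicated = [list(vector) for _ in matrix]  # extra rows x cols of memory
    return [[x - r for x, r in zip(row, rep)] for row, rep in zip(matrix, replicated)]


X = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
col_means = [sum(col) / len(X) for col in zip(*X)]  # [3.0, 4.0]
centered = subtract_row_vector(X, col_means)
print(centered)  # [[-2.0, -2.0], [0.0, 0.0], [2.0, 2.0]]
```

Both functions return the same values; the first simply never allocates the intermediate matrix.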
Question 15: Global variables cannot be accessed within a function. True or false?
- True
- False
Question 16: Which of the following is NOT a category of built-in functions in DML?
- Derivative built-in functions
- Matrix built-in functions
- Statistical built-in functions
- Casting built-in functions
Question 17: In the statistics propagation phase of the SystemML optimizer, what exactly is happening?
- To determine the confidence level of the computed results
- All the statistics are propagated to the top node to determine the most efficient runtime for query execution
- To determine the probability of the operation succeeding within a given period of time
- Find the widest matrix required and determine if it all fits into the heap.
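The idea of checking whether the largest matrix fits into memory can be sketched as a simple size estimate. The code below is a rough illustration, assuming dense matrices of 8-byte doubles and a hypothetical memory budget; the real optimizer also considers sparsity and other factors, and the names `dense_memory_bytes` and `choose_execution` are invented for this example.

```python
BYTES_PER_DOUBLE = 8


def dense_memory_bytes(rows, cols):
    """Estimate the in-memory size of a dense rows x cols matrix of doubles."""
    return rows * cols * BYTES_PER_DOUBLE


def choose_execution(operand_shapes, memory_budget_bytes):
    """Pick "local" if the largest operand fits in the budget, else "distributed"."""
    largest = max(dense_memory_bytes(r, c) for r, c in operand_shapes)
    return "local" if largest <= memory_budget_bytes else "distributed"


budget = 2 * 1024 ** 3  # hypothetical 2 GiB budget
small = choose_execution([(10_000, 100), (100, 1)], budget)
large = choose_execution([(100_000_000, 100), (100, 1)], budget)
print(small)  # local
print(large)  # distributed
```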
Question 18: What is the benefit of doing the matrix rewrite?
- Reduce the line of code needed to represent the matrix
- To determine the confidence level of the computed results
- Clean up any unused memory from the matrix
- To enable parallelization of the given matrix within a given period of time
- Represent the final matrix without computing the intermediate matrices
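A textbook identity shows how a rewrite can skip an intermediate matrix: trace(A B) equals the sum of the element-wise product of A and the transpose of B. The sketch below compares the two computations on plain Python lists. Whether SystemML applies this exact rewrite in a given version is an assumption here, and the function names are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def trace_naive(a, b):
    """Compute trace(A B) by materializing the full product first."""
    product = matmul(a, b)
    return sum(product[i][i] for i in range(len(product)))


def trace_rewritten(a, b):
    """Compute the same value as sum(A * t(B)), with no intermediate product."""
    b_t = list(zip(*b))
    return sum(x * y for row_a, row_bt in zip(a, b_t) for x, y in zip(row_a, row_bt))


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(trace_naive(A, B), trace_rewritten(A, B))  # 69 69
```

For n x n inputs, the naive version does on the order of n^3 multiplications and allocates an n x n result, while the rewritten form needs only n^2 multiplications and no intermediate matrix.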
Question 19: Which is NOT part of the SystemML runtime for Spark?
- Automates critical performance decisions
- Distributed vs. local runtime
- Efficient linear algebra optimizations
- Automated RDD caching
- None of the above
Question 20: SystemML is an Apache open source project. True or false?
- True
- False