Apply Here: DataOps Methodology
Module 1: Establish DataOps – Prepare for operation Lesson 2 – Establish Data Strategy
Question 1: Before we can put together a data strategy, we need to have a good understanding of the data available and how it is used in the organization.
- True
- False
Question 2: What is a data strategy?
- An architecture and actionable roadmap along with an action plan
- A competitive publication to show that our organization is modern
- A plan to move all legacy data systems to the cloud
Question 3: Implementing a data strategy should always result in cost savings in the year the plan is realized.
- True
- False
Question 4: Which of the following statements about Data Strategy are true?
- Whatever the type of data, it should only include internally produced data
- All types of data – both structured and unstructured need to be considered
- Volumes of data have increased hugely, but are now starting to stabilize
- Only business executives should be consulted in putting together a strategy
Question 5: Data Governance is a key part of executing a data strategy.
- True
- False
Module 1: Establish DataOps – Prepare for operation Lesson 3 – Establish Team
Question 1: A DataOps team consists of members mostly from IT departments.
- True
- False
Question 2: Which of the following roles are active team members of any DataOps team?
- Chief Technology Officer
- Chief Data Officer
- Data Engineer
- Database Administrator
- Data Steward
- Data Architect
- Data Scientist
Question 3: Creating and maintaining business terms is a major responsibility of which of the following roles?
- Data Engineer
- Data Quality Analyst
- Data Steward
- Data Scientist
Question 4: Only the Chief Data Officer can update the KPIs for a data sprint.
- True
- False
Question 5: DataOps relies heavily on the use of automation, so that communication among team members is not necessary.
- True
- False
Module 2: Establish DataOps – Optimize for operation Lesson 1 – Establish Toolchain
Question 1: A DataOps toolchain helps you deliver quality data slowly.
- True
- False
Question 2: DataOps Toolchain and DevOps are the same thing.
- True
- False
Question 3: DataOps Toolchain can work without DataOps API(s).
- True
- False
Question 4: What are the key components of DataOps Toolchain?
- Continuous Deployment
- Communication
- Source Control
- All of the above
Question 5: Who is responsible for creating DataOps Toolchain? (Choose all that apply)
- Data Scientist
- Administrator
- DBA
- Data Engineer
Module 2: Establish DataOps – Optimize for operation Lesson 2 – Establish Baseline
Question 1: Data Management is the same as Information Governance.
- True
- False
Question 2: What is the most costly result from an external influence to an organization?
- Data Breach Fines and Penalties
- Insurance Policy Payout
- Claim Settlement
- None of these
Question 3: Reference data is defined as data used as a permissible value within a data field.
- True
- False
Module 2: Establish DataOps – Optimize for operation Lesson 3 – Establish Business Priorities
Question 1: Business Priority should be the primary focus when deciding what the DataOps team should do.
- True
- False
Question 2: What is a data backlog?
- A bottleneck in the data pipeline
- A list of all data sources
- A prioritized set of requirements expressed as data tasks
- A plan to move all data into a catalog
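As an illustration of the backlog idea raised in this question, the following is a minimal sketch of a list of data tasks ordered by priority. The `DataTask` class, the 1-10 scoring fields, the value-to-cost ranking rule, and the task names are all hypothetical choices made for this example; the course does not prescribe a scoring scheme.

```python
from dataclasses import dataclass


@dataclass
class DataTask:
    name: str
    business_value: int  # hypothetical 1-10 score agreed with the business
    cost: int            # hypothetical 1-10 score for the effort of providing the data


def prioritize(tasks):
    # Rank by value-to-cost ratio, highest first (an assumed rule, not the course's).
    # sorted() is stable, so tasks with equal ratios keep their input order.
    return sorted(tasks, key=lambda t: t.business_value / t.cost, reverse=True)


backlog = prioritize([
    DataTask("Catalog customer master data", business_value=8, cost=4),
    DataTask("Mask PII in sales extract", business_value=9, cost=3),
    DataTask("Profile legacy billing tables", business_value=4, cost=4),
])

for rank, task in enumerate(backlog, start=1):
    print(rank, task.name)
```

With a ranking like this in place, the team can take the top items into the next data sprint without re-debating priorities each time.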
Question 3: A prioritized data backlog will reduce the time taken to start the next DataOps iteration.
- True
- False
Question 4: A Data Task should be prioritized by considering:
- The cost of providing the data
- The career advancement possibilities of solving business challenges
- The impact to sales from implementing the data pipeline
- All of the above
Question 5: KPIs are used to determine the progress and throughput of a DataOps data sprint.
- True
- False
Module 3: Iterate DataOps – Know your data Lesson 1 – Discover
Question 1: You will need someone on your team with detailed knowledge of the business processes you’re going to analyze, so that the selected data elements are appropriate for reaching your objectives.
- True
- False
Question 2: What should you do if you identify gaps or mismatches in the data required for the analysis?
- Rethink how you will do the analysis with different data
- Create the missing data
- Find a new source for the missing or mismatched data
- All of the above
Question 3: You should trace the lineage of data elements to be used for analysis to make sure they come from a trusted source.
- True
- False
Question 4: What is the primary objective of the Discover phase?
- Decide what the analytics team wants to have for lunch
- Identify and locate the specific data elements required to accomplish an analysis
- Uncover the meaning of data column headers and how they relate to the underlying data
- Gain an understanding of the business goals and KPIs of an analysis effort
Question 5: A Data Engineer who thoroughly understands where specific data resides, including the specific databases and files where each identified data element resides, should be involved in the Data Discovery process.
- True
- False
Module 3: Iterate DataOps – Know your data Lesson 2 – Classify
Question 1: Classification of each data element will make it easier going forward for users to distinguish the meaning and applicability of the data for their purposes.
- True
- False
Question 2: Which description best defines taxonomy?
- Organizing data elements into meaningful structures
- An IBM network protocol which reduces network latency
- The art of preparing, stuffing, and mounting the skins of animals with lifelike effect
Question 3: A single data element can be placed into an unlimited number of data domains.
- True
- False
Question 4: Which of the following is the objective of classification?
- To bring out points of similarity and dissimilarity among various groups
- To present data in a simple, logical and understandable form
- To condense the mass of data
- All of the above
Question 5: You should design workflows which are specific to the classification tool you are using.
- True
- False
Module 4: Iterate DataOps – Trust your data Lesson 1 – Manage Qualities & Entities
Question 1: Data quality is data accuracy.
- True
- False
Question 2: All data across the enterprise should have the same data quality.
- True
- False
Question 3: A data quality framework consists of which of the following 4 phases:
- Profile
- Define
- Remediate
- Monitor
- Assess
- Deploy
Question 4: When assessing data quality, you only need the data set containing the data; metadata is optional.
- True
- False
Module 4: Iterate DataOps – Trust your data Lesson 2 – Manage Policies
Question 1: How does data classification affect defining policies?
- Inheritance, retention and probabilities
- Protection, reporting and inheritance
- Protection, accessibility and retention
- Retention, deletion and storage
Question 2: What impact does a highly sensitive classification have on a policy definition?
- Require data anonymization, de-identification, and masking
- Limit access to the data and/or require data masking
- Limit access to the data and make it unprintable
- No impact
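To make the link between classification and policy concrete, here is a minimal sketch of a rule that masks values in columns classified as highly sensitive unless the consumer has a business need. The label name `highly_sensitive`, the helper functions, and the sample record are assumptions made for illustration; real catalog and governance tools define their own rule syntax.

```python
def mask_value(value: str, visible: int = 4) -> str:
    # Replace all but the last `visible` characters with asterisks.
    if len(value) <= visible:
        return "*" * len(value)
    return "*" * (len(value) - visible) + value[-visible:]


def apply_policy(record, classifications, has_business_need=False):
    # Columns without a classification are treated as "public" (an assumption).
    result = {}
    for column, value in record.items():
        label = classifications.get(column, "public")
        if label == "highly_sensitive" and not has_business_need:
            result[column] = mask_value(str(value))
        else:
            result[column] = value
    return result


record = {"name": "Ada", "ssn": "123-45-6789"}          # made-up sample data
classifications = {"ssn": "highly_sensitive"}           # hypothetical label

masked = apply_policy(record, classifications)
print(masked)
```

Passing `has_business_need=True` returns the record unchanged, which mirrors the idea that access is limited by need rather than removed for everyone.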
Question 3: What are the most common state, country or regional regulations affecting personal information?
- SIN, SSN and BAN
- FDIC, BCBS and SOX
- CCPA, GDPR and LGPD
- PCI, PII and PHI
Question 4: Once policies have been defined affecting the data, rules must be enforced to act.
- True
- False
Module 5: Iterate DataOps – Use your data Lesson 1 – Self Service
Question 1: Self Service of data is only possible when any data movement and transformation required to join multiple data assets have been performed.
- True
- False
Question 2: Self Service can use the following governance artefacts to refine a search in a catalog. (Choose all that apply)
- Data Protection Rules
- Business Terms
- Tags
Question 3: A data consumer should not be able to access data that has been identified as sensitive, where there is not a business need to do so.
- True
- False
Question 4: Which of the following statements about Self Service are true?
- Data consumers typically do not know how to manipulate the data
- Data Protection rules prevent a data consumer from inadvertently seeing data that is sensitive
- Creating multiple catalogs can partition data assets by their content and anticipated audience
- A data consumer needs to know SQL to join multiple data assets
Question 5: Data Consumers provide valuable input to data scientists by clarifying the combination of data assets and how they need to be transformed, prior to data movement being designed and implemented.
- True
- False
Module 5: Iterate DataOps – Use your data Lesson 2 – Manage Movement & Integration
Question 1: You should define the use case at the outset of a Data Movement and Integration project to support a “Build It and They Will Come” strategy.
- True
- False
Question 2: Which of the following does not represent a data integration pattern:
- Data virtualization
- Data replication
- Data lineage
- Message-oriented movement
- Bulk/batch
Question 3: Which of the following is not a Data Movement and Integration Job Design consideration?
- Design for reusability
- Deployment models (e.g. Containers, Kubernetes Orchestration, OpenShift)
- Design for parallel processing
- Everything should be programmed in Python
- Design for job portability (build once and run anywhere)
Question 4: Hand coding generally provides a 10X productivity gain over commercial data integration software tooling.
- True
- False
Question 5: Which of the following is not an example of a message queuing system?
- Kafka
- VSAM
- Microsoft Azure Queues
- GCP PubSub
- AWS Simple Queue Service
- MQ
Module 5: Iterate DataOps – Use your data Lesson 3 – Improve/Complete
Question 1: DataOps is a completely new methodology and it doesn’t learn anything from Agile and DevOps.
- True
- False
Question 2: Data consumers can first start to provide feedback to the current data sprint in the stakeholder review meeting.
- True
- False
Question 3: Which of the following assets or artifacts could be found in a catalog?
- Code
- Business terms
- Data rules
- Source data
- Data lineage
Question 4: All issues need to be remediated before moving on to the next data sprint.
- True
- False
Question 5: Completing a data sprint involves publishing governed artifacts and data assets to a production environment.
- True
- False
Module 6: Improve DataOps Review and Refine DataOps
Question 1: DataOps is a fixed process which should not be changed once defined.
- True
- False
Question 2: Improvements to the DataOps process could involve changes to
- Technology used in DataOps
- DataOps team roles and responsibilities
- Processes for ETL
- All of the above
Question 3: Reviewing the Data classification phase involves reviewing how accurate the data mappings to the business terms are.
- True
- False
Question 4: Reviewing the Establish Baseline Process should include reviewing how effective the processes are for establishing a baseline for –
- External Regulatory requirements
- Organization maturity and Readiness
- Governance and Oversight
- All of the above
Question 5: KPIs are key in determining the effectiveness of all parts of the DataOps process.
- True
- False
DataOps Methodology Final Exam Answers
Question 1: What is a data strategy?
- An architecture and actionable roadmap along with an action plan
- A competitive publication to show that our organization is modern
- A plan to move all legacy data systems to the cloud
Question 2: Which of the following statements about Data Strategy are true?
- Whatever the type of data, it should only include internally produced data
- All types of data – both structured and unstructured need to be considered
- Volumes of data have increased hugely, but are now starting to stabilize
- Only business executives should be consulted in putting together a strategy
Question 3: Which of the following roles are active team members of any DataOps team?
- Chief Technology Officer
- Chief Data Officer
- Data Engineer
- Database Administrator
- Data Steward
- Data Architect
- Data Scientist
Question 4: Creating and maintaining business terms is a major responsibility of which of the following roles?
- Data Engineer
- Data Quality Analyst
- Data Steward
- Data Scientist
Question 5: Business Priority should be the primary focus when deciding what the DataOps team should do.
- True
- False
Question 6: What is a data backlog?
- A bottleneck in the data pipeline
- A list of all data sources
- A prioritized set of requirements expressed as data tasks
- A plan to move all data into a catalog
Question 7: A Data Task should be prioritized by considering:
- The cost of providing the data
- The career advancement possibilities of solving business challenges
- The impact to sales from implementing the data pipeline
- All of the above
Question 8: KPIs are used to determine the progress and throughput of a DataOps data sprint.
- True
- False
Question 9: What are the key components of a DataOps toolchain?
- Continuous Deployment
- Communication
- Source Control
- All of the above
Question 10: Who is responsible for creating DataOps toolchain? (Choose all that apply)
- Data Scientist
- Administrator
- DBA
- Data Engineer
Question 11: What is the primary objective of the Discover phase?
- Decide what the analytics team wants to have for lunch.
- Identify and locate the specific data elements required to accomplish an analysis
- Uncover the meaning of data column headers and how they relate to the underlying data.
- Gain an understanding of the business goals and KPIs of an analysis effort.
Question 12: Which description best defines taxonomy?
- Organizing data elements into meaningful structures.
- An IBM network protocol which reduces network latency.
- The art of preparing, stuffing, and mounting the skins of animals with lifelike effect.
Question 13: Which of the following is the objective of classification?
- To bring out points of similarity and dissimilarity among various groups.
- To present data in a simple, logical and understandable form.
- To condense the mass of data.
- All of the above
Question 14: A data quality framework consists of which of the following 4 phases:
- Profile
- Define
- Remediate
- Monitor
- Assess
- Deploy
Question 15: How does data classification affect defining policies?
- Inheritance, retention and probabilities
- Protection, reporting and inheritance
- Protection, accessibility and retention
- Retention, deletion and storage
Question 16: What impact does a highly sensitive classification have on a policy definition?
- Require data anonymization, de-identification, and masking
- Limit access to the data and/or require data masking
- Limit access to the data and make it unprintable
- No impact
Question 17: Self Service can use the following governance artefacts to refine a search in a catalog. (Choose all that apply)
- Data Protection Rules
- Business Terms
- Tags
Question 18: Which of the following statements about Self Service are true?
- A data consumer needs to know SQL to join multiple data assets
- Data Protection rules prevent a data consumer from inadvertently seeing data that is sensitive
- Creating multiple catalogs can partition data assets by their content and anticipated audience
- Data consumers typically do not know how to manipulate the data
Question 19: Which of the following does not represent a data integration pattern:
- Data virtualization
- Data replication
- Data lineage
- Message-oriented movement
- Bulk/batch
Question 20: Which of the following is not a Data Movement and Integration Job Design consideration?
- Design for reusability
- Deployment models (e.g. containers, Kubernetes orchestration, OpenShift)
- Design for parallel processing
- Everything should be programmed in Python
- Design for job portability (build once and run anywhere)
Question 21: Data consumers can first start to provide feedback to the current data sprint in the stakeholder review meeting.
- True
- False
Question 22: Which of the following could be found in a catalog?
- Code
- Business terms
- Data rules
- Source data
- Data lineage
Question 23: All issues need to be remediated before moving on to the next data sprint.
- True
- False
Question 24: Improvements to the DataOps process could involve changes to
- Technology used in DataOps
- DataOps team roles and responsibilities
- Processes for ETL
- All of the above
Question 25: Reviewing the Establish Baseline Process should include reviewing how effective the processes are for establishing a baseline for –
- External Regulatory requirements
- Organization maturity and Readiness
- Governance and Oversight
- All of the above