Google Cloud Certified - Professional Cloud DevOps Engineer Exam Questions and Answers

Question 1

You have an application deployed to Cloud Run. A new version of the application has recently been deployed using the canary deployment strategy. Your Site Reliability Engineering (SRE) teammate informs you that an SLO has been exceeded for this application. You need to make the application healthy as quickly as possible. What should you do first?

Options:

Configure traffic splitting to send 100% of the traffic to the latest revision.

Configure traffic splitting to send 100% of the traffic to the previous revision.

Create a new revision using the last known good version of the application.

Identify the cause of the latency by using Cloud Trace.

Question 2

You work for a global organization and run a service with an availability target of 99% with limited engineering resources. For the current calendar month, you noticed that the service has 99.5% availability. You must ensure that your service meets the defined availability goals and can react to business changes, including the upcoming launch of new features. You also need to reduce technical debt while minimizing operational costs. You want to follow Google-recommended practices. What should you do?

Options:

Add N+1 redundancy to your service by adding additional compute resources to the service

Identify, measure and eliminate toil by automating repetitive tasks

Define an error budget for your service level availability and minimize the remaining error budget

Allocate available engineers to the feature backlog while you ensure that the service remains within the availability target
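The availability figures in this question translate directly into an error budget. The following is a minimal sketch of that arithmetic, assuming the budget is simply the allowed unavailability (1 minus the target); the function name and the returned keys are illustrative and do not come from any Google Cloud API.

```python
def error_budget_report(slo_target: float, measured_availability: float) -> dict:
    """Return the allowed, consumed, and remaining unavailability as fractions."""
    allowed = 1.0 - slo_target               # total error budget for the period
    consumed = 1.0 - measured_availability   # unavailability observed so far
    return {
        "allowed": allowed,
        "consumed": consumed,
        "remaining": allowed - consumed,
        # Share of the budget already spent; guarded against a 100% target.
        "fraction_consumed": consumed / allowed if allowed > 0 else float("inf"),
    }


# Figures from the question: 99% target, 99.5% measured this month.
report = error_budget_report(slo_target=0.99, measured_availability=0.995)
print(f"{report['fraction_consumed']:.0%} of the error budget consumed")
```

With a 99% target and 99.5% measured availability, about half of the month's error budget is still unspent.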

Question 3

You need to build a CI/CD pipeline for a containerized application in Google Cloud. Your development team uses a central Git repository for trunk-based development. You want to run all your tests in the pipeline for any new versions of the application to improve the quality. What should you do?

Options:

1. Install a Git hook to require developers to run unit tests before pushing the code to a central repository.
2. Trigger Cloud Build to build the application container. Deploy the application container to a testing environment, and run integration tests.
3. If the integration tests are successful, deploy the application container to your production environment, and run acceptance tests.

1. Install a Git hook to require developers to run unit tests before pushing the code to a central repository. If all tests are successful, build a container.
2. Trigger Cloud Build to deploy the application container to a testing environment, and run integration tests and acceptance tests.
3. If all tests are successful, tag the code as production ready. Trigger Cloud Build to build and deploy the application container to the production environment.

1. Trigger Cloud Build to build the application container, and run unit tests with the container.
2. If unit tests are successful, deploy the application container to a testing environment, and run integration tests.
3. If the integration tests are successful, the pipeline deploys the application container to the production environment. After that, run acceptance tests.

1. Trigger Cloud Build to run unit tests when the code is pushed. If all unit tests are successful, build and push the application container to a central registry.
2. Trigger Cloud Build to deploy the container to a testing environment, and run integration tests and acceptance tests.
3. If all tests are successful, the pipeline deploys the application to the production environment and runs smoke tests.

Question 4

You are the on-call Site Reliability Engineer for a microservice that is deployed to a Google Kubernetes Engine (GKE) Autopilot cluster. Your company runs an online store that publishes order messages to Pub/Sub, and a microservice receives these messages and updates stock information in the warehousing system. A sales event caused an increase in orders, and the stock information is not being updated quickly enough. This is causing a large number of orders to be accepted for products that are out of stock. You check the metrics for the microservice and compare them to typical levels.

You need to ensure that the warehouse system accurately reflects product inventory at the time orders are placed and minimize the impact on customers. What should you do?

Options:

Decrease the acknowledgment deadline on the subscription

Add a virtual queue to the online store that allows typical traffic levels

Increase the number of Pod replicas

Increase the Pod CPU and memory limits

Question 5

You are part of an organization that follows SRE practices and principles. You are taking over the management of a new service from the Development Team, and you conduct a Production Readiness Review (PRR). After the PRR analysis phase, you determine that the service cannot currently meet its Service Level Objectives (SLOs). You want to ensure that the service can meet its SLOs in production. What should you do next?

Options:

Adjust the SLO targets to be achievable by the service so you can bring it into production.

Notify the development team that they will have to provide production support for the service.

Identify recommended reliability improvements to the service to be completed before handover.

Bring the service into production with no SLOs and build them when you have collected operational data.

Question 6

Your team is designing a new application for deployment both inside and outside Google Cloud Platform (GCP). You need to collect detailed metrics such as system resource utilization. You want to use centralized GCP services while minimizing the amount of work required to set up this collection system. What should you do?

Options:

Import the Stackdriver Profiler package, and configure it to relay function timing data to Stackdriver for further analysis.

Import the Stackdriver Debugger package, and configure the application to emit debug messages with timing information.

Instrument the code using a timing library, and publish the metrics via a health check endpoint that is scraped by Stackdriver.

Install an Application Performance Monitoring (APM) tool in both locations, and configure an export to a central data storage location for analysis.

Question 7

Your team of Infrastructure DevOps Engineers is growing, and you are starting to use Terraform to manage infrastructure. You need a way to implement code versioning and to share code with other team members. What should you do?

Options:

Store the Terraform code in a version-control system. Establish procedures for pushing new versions and merging with the master.

Store the Terraform code in a network shared folder with child folders for each version release. Ensure that everyone works on different files.

Store the Terraform code in a Cloud Storage bucket using object versioning. Give access to the bucket to every team member so they can download the files.

Store the Terraform code in a shared Google Drive folder so it syncs automatically to every team member’s computer. Organize files with a naming convention that identifies each new version.

Question 8

You are configuring your CI/CD pipeline natively on Google Cloud. You want builds in a pre-production Google Kubernetes Engine (GKE) environment to be automatically load-tested before being promoted to the production GKE environment. You need to ensure that only builds that have passed this test are deployed to production. You want to follow Google-recommended practices. How should you configure this pipeline with Binary Authorization?

Options:

Create an attestation for the builds that pass the load test by requiring the lead quality assurance engineer to sign the attestation by using a key stored in Cloud Key Management Service (Cloud KMS).

Create an attestation for the builds that pass the load test by using a private key stored in Cloud Key Management Service (Cloud KMS) authenticated through Workload Identity.

Create an attestation for the builds that pass the load test by using a private key stored in Cloud Key Management Service (Cloud KMS) with a service account JSON key stored as a Kubernetes Secret.

Create an attestation for the builds that pass the load test by requiring the lead quality assurance engineer to sign the attestation by using their personal private key.

Question 9

You use Google Cloud Managed Service for Prometheus with managed collection to gather metrics from your service running on Google Kubernetes Engine (GKE). After deploying the service, there is no metric data appearing in Cloud Monitoring, and you have not encountered any error messages. You need to troubleshoot this issue. What should you do?

Options:

Determine if your service has exceeded its quota for writes to the Cloud Monitoring API.

Check if the Grafana service is installed on your GKE cluster.

Confirm that your service has the monitoring.servicesViewer IAM role.

Verify that your PodMonitoring configuration references a valid port.

Question 10

Your company follows Site Reliability Engineering principles. You are writing a postmortem for an incident, triggered by a software change, that severely affected users. You want to prevent severe incidents from happening in the future. What should you do?

Options:

Identify engineers responsible for the incident and escalate to their senior management.

Ensure that test cases that catch errors of this type are run successfully before new software releases.

Follow up with the employees who reviewed the changes and prescribe practices they should follow in the future.

Design a policy that will require on-call teams to immediately call engineers and management to discuss a plan of action if an incident occurs.

Question 11

Your company is migrating its production systems to Google Cloud. You need to implement site reliability engineering (SRE) practices during the migration to minimize customer impact from potential future incidents. Which two SRE practices should you implement?

Choose 2 answers

Options:

Ensure that full autonomy and permissions are only granted to the on-call team.

Automate common tasks to analyze key impact information and intelligently suggest mitigating actions for the on-call team.

Ensure that all teams can modify the production environment to resolve issues.

Create an alerting mechanism for your SRE team based on your system's internal behavior.

Create up-to-date playbooks with instructions for debugging and mitigating issues.

Answer:

B, E

Explanation:

Comprehensive and Detailed Explanation From General SRE Principles and Google Cloud Knowledge:

Site Reliability Engineering (SRE) emphasizes reliability, automation, and a data-driven approach to operations. The goal is to minimize the "time to detect" (TTD) and "time to resolve" (TTR) for incidents.

Option A (Ensure that full autonomy and permissions are only granted to the on-call team): While the on-call team needs appropriate permissions to act decisively during an incident, granting full autonomy and only to them can be a bottleneck and goes against the principle of least privilege if not carefully scoped. Broader teams might need specific, controlled access for their responsibilities. SRE encourages empowering teams but within a structured framework.

Option B (Automate common tasks to analyze key impact information and intelligently suggest mitigating actions for the on-call team): This is a core SRE practice. Automation reduces toil, speeds up response, and ensures consistency. Analyzing impact and suggesting mitigations helps the on-call team resolve issues faster and more effectively.

Option C (Ensure that all teams can modify the production environment to resolve issues): This is generally a bad practice and against SRE principles of controlled changes and reducing the blast radius of errors. Production changes should be managed, audited, and ideally automated, not open to modification by all teams, as this increases the risk of unintended incidents.

Option D (Create an alerting mechanism for your SRE team based on your system's internal behavior): While alerting is crucial, SRE emphasizes alerting on symptoms that affect users (Service Level Objectives - SLOs) rather than just internal behavior or causes. Alerting solely on internal behavior can lead to alert fatigue and may not correlate directly with user impact. Good alerting focuses on user-facing impact first.

Option E (Create up-to-date playbooks with instructions for debugging and mitigating issues): Playbooks (or runbooks) are essential in SRE. They document known issues, troubleshooting steps, and mitigation procedures. Keeping them up-to-date ensures that on-call engineers can respond to incidents quickly and consistently, even for less common issues, thereby minimizing customer impact.

Therefore, automating incident response tasks (B) and maintaining clear, actionable playbooks (E) are two key SRE practices to implement for minimizing customer impact.

Reference (Based on SRE principles):

The SRE books by Google (e.g., "Site Reliability Engineering: How Google Runs Production Systems") heavily emphasize automation to reduce toil and the importance of playbooks for incident management.

Google Cloud SRE solutions: https://cloud.google.com/sre

Specifically, regarding playbooks and automation: "Playbooks should be living documents, updated regularly as systems change and new incidents provide new lessons."

"SREs aim to automate repetitive tasks (toil) to free up time for engineering projects that improve reliability."

Question 12

You are implementing a CI/CD pipeline for your application in your company's multi-cloud environment. Your application is deployed by using custom Compute Engine images and the equivalent in other cloud providers. You need to implement a solution that will enable you to build and deploy the images to your current environment and is adaptable to future changes. Which solution stack should you use?

Options:

Cloud Build with Packer

Cloud Build with Google Cloud Deploy

Google Kubernetes Engine with Google Cloud Deploy

Cloud Build with kpt

Answer:

Explanation:

Cloud Build is a fully managed continuous integration and continuous delivery (CI/CD) service that helps you automate your builds, tests, and deployments. Google Cloud Deploy is a service that automates the deployment of your applications to Google Kubernetes Engine (GKE).

Together, Cloud Build and Google Cloud Deploy can be used to build and deploy your application's custom Compute Engine images to your current environment and to other cloud providers in the future.

Here are the steps involved in using Cloud Build and Google Cloud Deploy to implement a CI/CD pipeline for your application:

Create a Cloud Build trigger that fires whenever a change is made to your application's code.

In the Cloud Build trigger, configure Cloud Build to build your application's Docker image.

Create a Google Cloud Deploy configuration file that specifies how to deploy your application's Docker image to GKE.

In Google Cloud Deploy, create a deployment that uses your configuration file.

Once you have created the Cloud Build trigger and Google Cloud Deploy configuration file, any changes made to your application's code will trigger Cloud Build to build a new Docker image. Google Cloud Deploy will then deploy the new Docker image to GKE.

This solution stack is adaptable to future changes because it uses a cloud-agnostic approach. Cloud Build can be used to build Docker images for any cloud provider, and Google Cloud Deploy can be used to deploy Docker images to any Kubernetes cluster.

The other solution stacks are not as adaptable to future changes. For example, solution stack A (Cloud Build with Packer) builds machine images, such as custom Compute Engine images, rather than Docker images. Solution stack C (Google Kubernetes Engine with Google Cloud Deploy) is limited to deploying Docker images to GKE. Solution stack D (Cloud Build with kpt) is a newer solution that is not yet as mature as Cloud Build and Google Cloud Deploy.

Overall, the best solution stack for implementing a CI/CD pipeline for your application in a multi-cloud environment is Cloud Build with Google Cloud Deploy. This solution stack is fully managed, cloud-agnostic, and adaptable to future changes.

Question 13

You support an application running on GCP and want to configure SMS notifications to your team for the most critical alerts in Stackdriver Monitoring. You have already identified the alerting policies you want to configure this for. What should you do?

Options:

Download and configure a third-party integration between Stackdriver Monitoring and an SMS gateway. Ensure that your team members add their SMS/phone numbers to the external tool.

Select the Webhook notifications option for each alerting policy, and configure it to use a third-party integration tool. Ensure that your team members add their SMS/phone numbers to the external tool.

Ensure that your team members set their SMS/phone numbers in their Stackdriver Profile. Select the SMS notification option for each alerting policy and then select the appropriate SMS/phone numbers from the list.

Configure a Slack notification for each alerting policy. Set up a Slack-to-SMS integration to send SMS messages when Slack messages are received. Ensure that your team members add their SMS/phone numbers to the external integration.

Question 14

You work for a company that manages highly sensitive user data. You are designing the Google Kubernetes Engine (GKE) infrastructure for your company, including several applications that will be deployed in development and production environments. Your design must protect data from unauthorized access from other applications while minimizing the amount of management overhead required. What should you do?

Options:

Create one cluster for the organization with separate namespaces for each application and environment combination.

Create one cluster for each environment (development and production) with each application in its own namespace within each cluster.

Create one cluster for the organization with separate namespaces for each application.

Create one cluster for each application with separate namespaces for production and development environments.

Question 15

You need to deploy a new service to production. The service needs to automatically scale using a Managed Instance Group (MIG) and should be deployed over multiple regions. The service needs a large number of resources for each instance and you need to plan for capacity. What should you do?

Options:

Use the n2-highcpu-96 machine type in the configuration of the MIG.

Monitor results of Stackdriver Trace to determine the required amount of resources.

Validate that the resource requirements are within the available quota limits of each region.

Deploy the service in one region and use a global load balancer to route traffic to this region.

Question 16

Your company uses a CI/CD pipeline with Cloud Build and Artifact Registry to deploy container images to Google Kubernetes Engine (GKE). Images are tagged with the latest commit hash and promoted to production after successful testing in the development and pre-production environments. A recent production deployment caused the application to fail due to untested integration functionality, requiring a disruptive manual rollback. During the rollback, you noticed many old and unused container images accumulating in Artifact Registry. You need to improve rollout and rollback management and clean up the old container images. What should you do?

Options:

Adopt Cloud Deploy for managing deployments, and schedule a Cloud Build job for container image cleanup.

Deploy Cloud Service Mesh across the GKE clusters, and manually clean up Artifact Registry images.

Adopt Cloud Deploy for managing deployments, and implement an Artifact Registry cleanup policy.

Set up a rollback pipeline in Cloud Build, and implement an Artifact Registry cleanup policy.

Question 17

You are creating Cloud Logging sinks to export log entries from Cloud Logging to BigQuery for future analysis. Your organization has a Google Cloud folder named Dev that contains development projects and a folder named Prod that contains production projects. Log entries for development projects must be exported to dev_dataset, and log entries for production projects must be exported to prod_dataset. You need to minimize the number of log sinks created, and you want to ensure that the log sinks apply to future projects. What should you do?

Options:

Create a single aggregated log sink at the organization level.

Create a log sink in each project

Create two aggregated log sinks at the organization level, and filter by project ID

Create an aggregated log sink in the Dev and Prod folders

Question 18

You need to define SLOs for a high-traffic web application. Customers are currently happy with the application performance and availability. Based on current measurement, the 90th percentile of latency is 160 ms and the 95th percentile of latency is 300 ms over a 28-day window. What latency SLO should you publish?

Options:

90th percentile - 150 ms
95th percentile - 290 ms

90th percentile - 160 ms
95th percentile - 300 ms

90th percentile - 190 ms
95th percentile - 330 ms

90th percentile - 300 ms
95th percentile - 450 ms
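For readers unsure how figures such as "90th percentile of latency" are obtained, the following is a small sketch that uses the nearest-rank method on a list of latency samples. The sample data is synthetic, and the choice of nearest-rank (rather than an interpolating method) is an assumption made for simplicity.

```python
from typing import List


def percentile(samples_ms: List[float], pct: int) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples_ms:
        raise ValueError("at least one sample is required")
    ordered = sorted(samples_ms)
    # Integer ceiling of pct * n / 100, which avoids floating-point rounding surprises.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[max(rank, 1) - 1]


# Synthetic latencies: 1 ms, 2 ms, ..., 100 ms.
latencies = [float(ms) for ms in range(1, 101)]
p90 = percentile(latencies, 90)
p95 = percentile(latencies, 95)
print(p90, p95)
```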

Question 19

You need to create a Cloud Monitoring SLO for a service that will be published soon. You want to verify that requests to the service will be addressed in fewer than 300 ms at least 90% of the time per calendar month. You need to identify the metric and evaluation method to use. What should you do?

Options:

Select a latency metric for a request-based method of evaluation.

Select a latency metric for a window-based method of evaluation.

Select an availability metric for a request-based method of evaluation.

Select an availability metric for a window-based method of evaluation.
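The two evaluation methods named in the options can give different results for the same traffic. The sketch below contrasts them under simple assumptions: a request counts as "good" if it finished in under 300 ms, and a window counts as "good" if its share of good requests meets a threshold. The function names and sample counts are illustrative only and are not part of the Cloud Monitoring API.

```python
from typing import List, Tuple

# Each tuple is (good_requests, total_requests) for one evaluation window.
Window = Tuple[int, int]


def request_based_compliance(windows: List[Window]) -> float:
    """Good requests divided by all requests across the whole period."""
    good = sum(g for g, _ in windows)
    total = sum(t for _, t in windows)
    return good / total if total else 1.0


def window_based_compliance(windows: List[Window], window_threshold: float) -> float:
    """Good windows divided by all windows; empty windows count as good."""
    if not windows:
        return 1.0
    good_windows = sum(1 for g, t in windows if t == 0 or g / t >= window_threshold)
    return good_windows / len(windows)


# Three busy, healthy windows and one quiet window with mostly slow requests.
windows = [(1000, 1000), (1000, 1000), (10, 100), (1000, 1000)]
request_ratio = request_based_compliance(windows)
window_ratio = window_based_compliance(windows, window_threshold=0.9)
print(f"request-based: {request_ratio:.3f}, window-based: {window_ratio:.2f}")
```

In this sample the quiet, slow window barely moves the request-based ratio but costs the window-based ratio a full quarter, which is why the phrasing of the objective matters when picking a method.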

Question 20

You manage your company's primary revenue-generating application. You have an error budget policy in place that freezes production deployments when the application is close to breaching its SLO. A number of issues have recently occurred, and the application has exhausted its error budget. You need to deploy a new release to the application that includes a feature urgently required by your largest customer. You have been told that the release has passed all unit tests. What should you do?

Options:

Start the deployment of the feature immediately.

Delay the deployment of the feature until the error budget is replenished.

Re-run the unit tests, and start the deployment of the feature if the tests pass.

Deploy the feature to a subset of users, and gradually roll out to all users if there are no errors reported.

Answer:

Explanation:

Comprehensive and Detailed Explanation From SRE Principles:

This scenario presents a classic SRE conflict: maintaining reliability (as dictated by the exhausted error budget and deployment freeze) versus delivering an urgent business requirement. The error budget policy is there for a reason – to protect users from further instability.

A. Start the deployment of the feature immediately: This directly violates the established error budget policy and the deployment freeze. While the feature is urgent, deploying without caution when the system is already unstable (as indicated by the exhausted error budget) is highly risky and could exacerbate existing problems or introduce new ones, further impacting revenue and customer trust.

B. Delay the deployment of the feature until the error budget is replenished: This strictly adheres to the policy but might not be acceptable given the "urgently required by your largest customer" clause. SRE principles allow for reasoned exceptions and risk management, not just blind adherence if the business context is compelling enough and risks are managed.

C. Re-run the unit tests, and start the deployment of the feature if the tests pass: Unit tests are foundational but insufficient to guarantee a complex application will perform reliably in production, especially when the system is already indicating instability (exhausted error budget). Passing unit tests doesn't negate the risk signaled by the depleted error budget.

D. Deploy the feature to a subset of users, and gradually roll out to all users if there are no errors reported: This is the most balanced SRE approach in this situation. It acknowledges the urgency while attempting to mitigate risk:

Risk Mitigation: A canary release (deploying to a small subset of users) limits the potential negative impact if the new feature introduces new errors or worsens existing instability.

Observation: It allows for careful monitoring of the new release in the production environment with real users.

Data-Driven Decision: The decision to proceed with a wider rollout is based on observed behavior ("if there are no errors reported"), not just assumptions.

Controlled Rollout: A gradual rollout allows for quick rollback if issues arise.

While an exhausted error budget signals a deployment freeze, critical business needs can sometimes necessitate a carefully managed exception. A canary release is a standard SRE technique for deploying changes with reduced risk, making it the most appropriate course of action when faced with such conflicting priorities. The team would also need to communicate clearly about the risks and the rationale for this exception. It's implied that this urgent feature might also fix existing issues or is critical enough to warrant the carefully managed risk.
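The gradual rollout described above can be sketched as a loop over traffic stages that stops and rolls back as soon as the observed error rate crosses a threshold. This is a simplified illustration, not a deployment tool: the stage percentages, the error threshold, and the observed error rates are all hypothetical values.

```python
from typing import Callable, List


def gradual_rollout(
    stages: List[int],
    error_rate_at: Callable[[int], float],
    max_error_rate: float,
) -> int:
    """Walk through traffic stages and return the final traffic percentage.

    Returns 0 if any stage exceeds the error threshold (rolled back).
    """
    for percent in stages:
        # Observe the new release while it serves `percent` of the traffic.
        if error_rate_at(percent) > max_error_rate:
            return 0  # roll back: all traffic returns to the previous release
    return stages[-1] if stages else 0


# Hypothetical observations: error rate seen at each traffic percentage.
healthy = {5: 0.001, 25: 0.002, 50: 0.002, 100: 0.003}
regressing = {5: 0.001, 25: 0.040, 50: 0.050, 100: 0.050}

stages = [5, 25, 50, 100]
healthy_result = gradual_rollout(stages, healthy.__getitem__, max_error_rate=0.01)
regressing_result = gradual_rollout(stages, regressing.__getitem__, max_error_rate=0.01)
print(healthy_result, regressing_result)
```

The healthy release reaches 100% of traffic, while the regressing one is rolled back at the 25% stage, so at most a quarter of users were exposed to it.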

Reference (Based on SRE principles from Google's SRE books and general practices):

Error Budgets: "The SRE Book" (Site Reliability Engineering: How Google Runs Production Systems) discusses error budgets and deployment freezes. An exhausted error budget typically means no more risky changes until reliability improves.

Canary Releases: This is a fundamental practice for safely deploying new versions. It's about testing in production with a small percentage of traffic.

Managing Risk: SRE is about managing risk, not eliminating it entirely. In situations like this, a calculated risk with strong mitigation (canary, monitoring, rollback plan) can be justified for critical business needs. The decision involves weighing the risk of deploying against the risk of not deploying the urgent feature.

Option D represents a pragmatic SRE approach to navigate this difficult situation by minimizing the blast radius of the change.

Question 21

You are deploying a Cloud Build job that deploys Terraform code when a Git branch is updated. While testing, you noticed that the job fails. You see the following error in the build logs:

Initializing the backend...

Error: Failed to get existing workspaces: querying Cloud Storage failed: googleapi: Error 403

You need to resolve the issue by following Google-recommended practices. What should you do?

Options:

Change the Terraform code to use local state.

Create a storage bucket with the name specified in the Terraform configuration.

Grant the roles/owner Identity and Access Management (IAM) role to the Cloud Build service account on the project.

Grant the roles/storage.objectAdmin Identity and Access Management (IAM) role to the Cloud Build service account on the state file bucket.

Answer:

Explanation:

The correct answer is D. Grant the roles/storage.objectAdmin Identity and Access Management (IAM) role to the Cloud Build service account on the state file bucket.

According to the Google Cloud documentation, Cloud Build is a service that executes your builds on Google Cloud Platform infrastructure [1]. Cloud Build uses a service account to execute your build steps and access resources, such as Cloud Storage buckets [2]. Terraform is an open-source tool that allows you to define and provision infrastructure as code [3]. Terraform uses a state file to store and track the state of your infrastructure [4]. You can configure Terraform to use a Cloud Storage bucket as a backend to store and share the state file across multiple users or environments [5].

The error message indicates that Cloud Build failed to access the Cloud Storage bucket that contains the Terraform state file. This is likely because the Cloud Build service account does not have the necessary permissions to read and write objects in the bucket. To resolve this issue, you need to grant the roles/storage.objectAdmin IAM role to the Cloud Build service account on the state file bucket. This role allows the service account to create, delete, and manage objects in the bucket [6]. You can use the gcloud command-line tool or the Google Cloud Console to grant this role.
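To make the shape of such a grant concrete, the sketch below adds a role binding to a bucket-level IAM policy represented as a plain dictionary. It only models the policy document (a list of bindings, each with a role and members) and does not call any Google Cloud API; the service account address is a hypothetical placeholder.

```python
import copy


def add_iam_binding(policy: dict, role: str, member: str) -> dict:
    """Return a copy of the policy in which `member` holds `role`."""
    updated = copy.deepcopy(policy)  # leave the caller's policy untouched
    bindings = updated.setdefault("bindings", [])
    for binding in bindings:
        if binding.get("role") == role:
            members = binding.setdefault("members", [])
            if member not in members:
                members.append(member)
            return updated
    # No existing binding for this role, so add a new one.
    bindings.append({"role": role, "members": [member]})
    return updated


# Hypothetical Cloud Build service account; substitute your own.
build_account = "serviceAccount:123456789012@cloudbuild.gserviceaccount.com"

bucket_policy = {"bindings": []}
new_policy = add_iam_binding(bucket_policy, "roles/storage.objectAdmin", build_account)
print(new_policy)
```

Scoping the binding to the state bucket's policy, rather than the project's, is what keeps the grant narrower than roles/owner.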

The other options are incorrect because they do not follow Google-recommended practices. Option A is incorrect because it changes the Terraform code to use local state, which is not recommended for production or collaborative environments, as it can cause conflicts, data loss, or inconsistency. Option B is incorrect because it creates a new storage bucket with the name specified in the Terraform configuration, but it does not grant any permissions to the Cloud Build service account on the new bucket. Option C is incorrect because it grants the roles/owner IAM role to the Cloud Build service account on the project, which is too broad and violates the principle of least privilege. The roles/owner role grants full access to all resources in the project, which can pose a security risk if misused or compromised.

Reference:

[1] Cloud Build Documentation, Overview

[2] Service accounts

[3] Terraform by HashiCorp

[4] State

[5] Google Cloud Storage Backend

[6] Predefined roles

[7] Granting roles to service accounts for specific resources

[8] Local Backend

[9] Understanding roles

Question 22

You work for a global organization and are running a monolithic application on Compute Engine. You need to select the machine type for the application to use that optimizes CPU utilization by using the fewest number of steps. You want to use historical system metrics to identify the machine type for the application to use. You want to follow Google-recommended practices. What should you do?

Options:

Use the Recommender API and apply the suggested recommendations

Create an Agent Policy to automatically install Ops Agent in all VMs

Install the Ops Agent in a fleet of VMs by using the gcloud CLI

Review the Cloud Monitoring dashboard for the VM and choose the machine type with the lowest CPU utilization

Answer:

Explanation:

The best option for selecting the machine type for the application to use that optimizes CPU utilization by using the fewest number of steps is to use the Recommender API and apply the suggested recommendations. The Recommender API is a service that provides recommendations for optimizing your Google Cloud resources, such as Compute Engine instances, disks, and firewalls. You can use the Recommender API to get recommendations for changing the machine type of your Compute Engine instances based on historical system metrics, such as CPU utilization. You can also apply the suggested recommendations by using the Recommender API or Cloud Console. This way, you can optimize CPU utilization by using the most suitable machine type for your application with minimal effort.

Question 23

Your CTO has asked you to implement a postmortem policy on every incident for internal use. You want to define what a good postmortem is to ensure that the policy is successful at your company. What should you do?

Choose 2 answers

Options:

Ensure that all postmortems include what caused the incident, identify the person or team responsible for causing the incident, and how to prevent a future occurrence of the incident.

Ensure that all postmortems include what caused the incident, how the incident could have been worse, and how to prevent a future occurrence of the incident.

Ensure that all postmortems include the severity of the incident, how to prevent a future occurrence of the incident, and what caused the incident without naming internal system components.

Ensure that all postmortems include how the incident was resolved and what caused the incident without naming customer information.

Ensure that all postmortems include all incident participants in postmortem authoring and share postmortems as widely as possible.

Answer: BE

The correct answers are B and E.

A good postmortem should include what caused the incident, how the incident could have been worse, and how to prevent a future occurrence of the incident [1]. This helps to identify the root cause of the problem, the impact of the incident, and the actions to take to mitigate or eliminate the risk of recurrence.

A good postmortem should also include all incident participants in postmortem authoring and share postmortems as widely as possible [2]. This helps to foster a culture of learning and collaboration, as well as to increase the visibility and accountability of the incident response process.

Answer A is incorrect because it assigns blame to a person or team, which goes against the principle of blameless postmortems [2]. Blameless postmortems focus on finding solutions rather than pointing fingers, and encourage honest and constructive feedback without fear of punishment.

Answer C is incorrect because it omits how the incident could have been worse, which is an important factor to consider when evaluating the severity and impact of the incident [1]. It also avoids naming internal system components, which makes it harder to understand the technical details and root cause of the problem.

Answer D is incorrect because it omits how to prevent a future occurrence of the incident, which is the main goal of a postmortem [1]. It also avoids naming customer information, which may be relevant for understanding the impact and scope of the incident.

Your company uses Jenkins running on Google Cloud VM instances for CI/CD. You need to extend the functionality to use infrastructure as code automation by using Terraform. You must ensure that the Terraform Jenkins instance is authorized to create Google Cloud resources. You want to follow Google-recommended practices. What should you do?

Add the auth application-default command as a step in Jenkins before running the Terraform commands.

Create a dedicated service account for the Terraform instance. Download and copy the secret key value to the GOOGLE environment variable on the Jenkins server.

Confirm that the Jenkins VM instance has an attached service account with the appropriate Identity and Access Management (IAM) permissions.

Use the Terraform module so that Secret Manager can retrieve credentials.

Answer: C

The correct answer is C.

Confirming that the Jenkins VM instance has an attached service account with the appropriate Identity and Access Management (IAM) permissions is the best way to ensure that the Terraform Jenkins instance is authorized to create Google Cloud resources. This follows the Google-recommended practice of using service accounts to authenticate and authorize applications running on Google Cloud [1]. Service accounts are associated with private keys that can be used to generate access tokens for Google Cloud APIs [2]. By attaching a service account to the Jenkins VM instance, Terraform can use the Application Default Credentials (ADC) strategy to automatically find and use the service account credentials [3].

Answer A is incorrect because the auth application-default command is used to obtain user credentials, not service account credentials. User credentials are not recommended for applications running on Google Cloud, as they are less secure and less scalable than service account credentials [1].

Answer B is incorrect because it involves downloading and copying the secret key value of the service account, which is not a secure or reliable way of managing credentials. The secret key value should be kept private and not exposed to any other system or user [2]. Moreover, setting the GOOGLE environment variable on the Jenkins server is not a valid way of providing credentials to Terraform. Terraform expects the credentials to be either in a file pointed to by the GOOGLE_APPLICATION_CREDENTIALS environment variable, or in a provider block with the credentials argument [3].

Answer D is incorrect because it involves using the Terraform module for Secret Manager, which is a service that stores and manages sensitive data such as API keys, passwords, and certificates. While Secret Manager can be used to store and retrieve credentials, it is not necessary or sufficient for authorizing the Terraform Jenkins instance. The Terraform Jenkins instance still needs a service account with the appropriate IAM permissions to access Secret Manager and other Google Cloud resources.
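One way a pipeline step could confirm which service account is attached to the Jenkins VM, before any Terraform command runs, is to ask the Compute Engine metadata server. The following is a minimal sketch under that assumption; the function names are illustrative, and the lookup itself only succeeds when executed on a Google Cloud VM.

```python
import urllib.request

# Metadata server endpoint that reports the service account attached to a VM.
METADATA_URL = (
    "http://metadata.google.internal/computeMetadata/v1/"
    "instance/service-accounts/default/email"
)


def build_metadata_request() -> urllib.request.Request:
    """Build (but do not send) the request for the attached service account."""
    # The metadata server rejects requests that lack this header.
    return urllib.request.Request(METADATA_URL, headers={"Metadata-Flavor": "Google"})


def attached_service_account(timeout: float = 2.0) -> str:
    """Return the attached service account email.

    Only works on a Google Cloud VM; elsewhere the host name does not resolve.
    """
    with urllib.request.urlopen(build_metadata_request(), timeout=timeout) as resp:
        return resp.read().decode("utf-8").strip()
```

If the returned email is the dedicated account that holds the required IAM roles, Terraform can rely on ADC and no key file has to be stored on the Jenkins server.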

You are analyzing Java applications in production. All applications have Cloud Profiler and Cloud Trace installed and configured by default. You want to determine which applications need performance tuning. What should you do?

Choose 2 answers

A. Examine the wall-clock time and the CPU time of the application. If the difference is substantial, increase the CPU resource allocation.

B. Examine the wall-clock time and the CPU time of the application. If the difference is substantial, increase the memory resource allocation.

C. Examine the wall-clock time and the CPU time of the application. If the difference is substantial, increase the local disk storage allocation.

D. Examine the latency time, the wall-clock time, and the CPU time of the application. If the latency time is slowly burning down the error budget, and the difference between wall-clock time and CPU time is minimal, mark the application for optimization.

E. Examine the heap usage of the application. If the usage is low, mark the application for optimization.

Answer: AD

The correct answers are A and D.

Examine the wall-clock time and the CPU time of the application. If the difference is substantial, increase the CPU resource allocation. This is a good way to determine if the application is CPU-bound, meaning that it spends more time waiting for the CPU than performing actual computation. Increasing the CPU resource allocation can improve the performance of CPU-bound applications [1].

Examine the latency time, the wall-clock time, and the CPU time of the application. If the latency time is slowly burning down the error budget, and the difference between wall-clock time and CPU time is minimal, mark the application for optimization. This is a good way to determine if the application is I/O-bound, meaning that it spends more time waiting for input/output operations than performing actual computation. Increasing the CPU resource allocation will not help I/O-bound applications, and they may need optimization to reduce the number or duration of I/O operations [2].

Answer B is incorrect because increasing the memory resource allocation will not help if the application is CPU-bound or I/O-bound. Memory allocation affects how much data the application can store and access in memory, but it does not affect how fast the application can process that data.

Answer C is incorrect because increasing the local disk storage allocation will not help if the application is CPU-bound or I/O-bound. Disk storage affects how much data the application can store and access on disk, but it does not affect how fast the application can process that data.

Answer E is incorrect because examining the heap usage of the application will not help to determine if the application needs performance tuning. Heap usage affects how much memory the application allocates for dynamic objects, but it does not affect how fast the application can process those objects. Moreover, low heap usage does not necessarily mean that the application is inefficient or unoptimized.
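To make the two quantities concrete, the sketch below measures wall-clock time and CPU time for two toy tasks inside a single Python process. It is only an illustration of what the numbers mean, not how Cloud Profiler collects them; the task functions are hypothetical stand-ins.

```python
import time
from typing import Callable, Tuple


def measure(task: Callable[[], None]) -> Tuple[float, float]:
    """Return (wall_seconds, cpu_seconds) spent running task."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    task()
    cpu_seconds = time.process_time() - cpu_start
    wall_seconds = time.perf_counter() - wall_start
    return wall_seconds, cpu_seconds


def waiting_task() -> None:
    # Stands in for a blocking call; the process sleeps instead of computing.
    time.sleep(0.2)


def busy_task() -> None:
    # Pure computation: the process uses the CPU for the whole duration.
    sum(i * i for i in range(300_000))


for name, task in (("waiting", waiting_task), ("busy", busy_task)):
    wall, cpu = measure(task)
    print(f"{name}: wall={wall:.3f}s cpu={cpu:.3f}s gap={wall - cpu:.3f}s")
```

The waiting task shows a large gap between wall-clock and CPU time, while the busy task shows the two values close together on an otherwise idle machine.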

You deployed an application into a large Standard Google Kubernetes Engine (GKE) cluster. The application is stateless and multiple pods run at the same time. Your application receives inconsistent traffic. You need to ensure that the user experience remains consistent regardless of changes in traffic and that the resource usage of the cluster is optimized.

What should you do?

Configure a cron job to scale the deployment on a schedule.

Configure a Horizontal Pod Autoscaler.

Configure a Vertical Pod Autoscaler.

Configure cluster autoscaling on the node pool.

Answer: B

Question 23

You are designing a deployment technique for your applications on Google Cloud. As part of your deployment planning, you want to use live traffic to gather performance metrics for new versions of your applications. You need to test against the full production load before your applications are launched. What should you do?

Options:

Use A/B testing with blue/green deployment.

Use shadow testing with continuous deployment.

Use canary testing with continuous deployment.

Use canary testing with rolling updates deployment.

Question 24

You recently deployed your application in Google Kubernetes Engine (GKE) and now need to release a new version of the application. You need the ability to instantly roll back to the previous version of the application in case there are issues with the new version. Which deployment model should you use?

Options:

Perform a rolling deployment and test your new application after the deployment is complete

Perform A/B testing, and test your application periodically after the deployment is complete

Perform a canary deployment, and test your new application periodically after the new version is deployed

Perform a blue/green deployment and test your new application after the deployment is complete

Question 25

You recently migrated an ecommerce application to Google Cloud. You now need to prepare the application for the upcoming peak traffic season. You want to follow Google-recommended practices. What should you do first to prepare for the busy season?

Options:

Migrate the application to Cloud Run, and use autoscaling.

Load test the application to profile its performance for scaling.

Create a Terraform configuration for the application's underlying infrastructure to quickly deploy to additional regions.

Pre-provision the additional compute power that was used last season, and expect growth.

Answer: B

Explanation:

The first thing you should do to prepare your ecommerce application for the upcoming peak traffic season is to load test the application to profile its performance for scaling. Load testing is a process of simulating high traffic or user demand on your application and measuring how it responds. Load testing can help you identify any bottlenecks, errors, or performance issues that might affect your application during the busy season [1]. Load testing can also help you determine the optimal scaling strategy for your application, such as horizontal scaling (adding more instances) or vertical scaling (adding more resources to each instance) [2].

There are different tools and methods for load testing your ecommerce application on Google Cloud, depending on the type and complexity of your application. For example, you can use Cloud Load Balancing to distribute traffic across multiple instances of your application, and use Cloud Monitoring to measure the latency, throughput, and error rate of your application [3]. You can also use Cloud Functions or Cloud Run to create serverless load generators that can simulate user requests and send them to your application [4]. Alternatively, you can use third-party tools such as Apache JMeter or Locust to create and run load tests on your application.

By load testing your ecommerce application before the peak traffic season, you can ensure that your application is ready to handle the expected load and provide a good user experience. You can also use the results of your load tests to plan and implement other steps to prepare your application for the busy season, such as migrating to a more scalable platform, creating a Terraform configuration for deploying to additional regions, or pre-provisioning additional compute power.
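The idea of a load generator can be sketched in a few lines. The example below is a toy, assuming a stubbed request function in place of real HTTP calls to a staging endpoint; fake_request and run_load are hypothetical names, and a real test would more likely use a tool such as Locust or JMeter.

```python
import random
import statistics
import time
from concurrent.futures import ThreadPoolExecutor


def fake_request() -> float:
    """Simulate one request and return its latency in seconds.

    Hypothetical stand-in for an HTTP call to a staging endpoint.
    """
    start = time.perf_counter()
    time.sleep(random.uniform(0.001, 0.005))
    return time.perf_counter() - start


def run_load(total_requests: int, concurrency: int) -> dict:
    """Issue total_requests simulated calls using `concurrency` worker threads."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = list(pool.map(lambda _: fake_request(), range(total_requests)))
    cuts = statistics.quantiles(latencies, n=100)  # 99 percentile cut points
    return {
        "count": len(latencies),
        "p50": cuts[49],
        "p95": cuts[94],
        "max": max(latencies),
    }


summary = run_load(total_requests=200, concurrency=20)
print(summary)
```

Repeating the run at increasing concurrency levels and watching how p95 latency changes gives a first profile of where the application starts to degrade.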

Question 26

Your application's performance in Google Cloud has degraded since the last release. You suspect that downstream dependencies might be causing some requests to take longer to complete. You need to investigate the issue with your application to determine the cause. What should you do?

Options:

Configure Error Reporting in your application

Configure Google Cloud Managed Service for Prometheus in your application

Configure Cloud Profiler in your application

Configure Cloud Trace in your application

Question 27

You are running a web application deployed to a Compute Engine managed instance group. Ops Agent is installed on all instances. You recently noticed suspicious activity from a specific IP address. You need to configure Cloud Monitoring to view the number of requests from that specific IP address with minimal operational overhead. What should you do?

Options:

Configure the Ops Agent with a logging receiver. Create a logs-based metric.

Create a script to scrape the web server log. Export the IP address request metrics to the Cloud Monitoring API.

Update the application to export the IP address request metrics to the Cloud Monitoring API

Configure the Ops Agent with a metrics receiver

Question 28

You use Spinnaker to deploy your application and have created a canary deployment stage in the pipeline. Your application has an in-memory cache that loads objects at start time. You want to automate the comparison of the canary version against the production version. How should you configure the canary analysis?

Options:

Compare the canary with a new deployment of the current production version.

Compare the canary with a new deployment of the previous production version.

Compare the canary with the existing deployment of the current production version.

Compare the canary with the average performance of a sliding window of previous production versions.

Question 29

Your company runs applications in Google Kubernetes Engine (GKE). Several applications rely on ephemeral volumes. You noticed some applications were unstable due to the DiskPressure node condition on the worker nodes. You need to identify which Pods are causing the issue, but you do not have execute access to workloads and nodes. What should you do?

Options:

Check the node/ephemeral_storage/used_bytes metric by using Metrics Explorer.

Check the metric by using Metrics Explorer.

Locate all the Pods with emptyDir volumes. Use the df -h command to measure volume disk usage.

Locate all the Pods with emptyDir volumes. Use the du -sh * command to measure volume disk usage.

Question 30

You have deployed a fleet of Compute Engine instances in Google Cloud. You need to ensure that monitoring metrics and logs for the instances are visible in Cloud Logging and Cloud Monitoring by your company's operations and cyber security teams. You need to grant the required roles for the Compute Engine service account by using Identity and Access Management (IAM) while following the principle of least privilege. What should you do?

Options:

Grant the logging.editor and monitoring.metricWriter roles to the Compute Engine service accounts.

Grant the logging.admin and monitoring.editor roles to the Compute Engine service accounts.

Grant the logging.logWriter and monitoring.editor roles to the Compute Engine service accounts.

Grant the logging.logWriter and monitoring.metricWriter roles to the Compute Engine service accounts.

Answer: D

Explanation:

The correct answer is D. Grant the logging.logWriter and monitoring.metricWriter roles to the Compute Engine service accounts.

According to the Google Cloud documentation, the Compute Engine service account is a Google-managed service account that is automatically created when you enable the Compute Engine API [1]. This service account is used by default to run your Compute Engine instances and access other Google Cloud services on your behalf [1]. To ensure that monitoring metrics and logs for the instances are visible in Cloud Logging and Cloud Monitoring, you need to grant the following IAM roles to the Compute Engine service account [2][3]:

The logging.logWriter role allows the service account to write log entries to Cloud Logging [4].

The monitoring.metricWriter role allows the service account to write custom metrics to Cloud Monitoring [5].

These roles grant the minimum permissions that are needed for logging and monitoring, following the principle of least privilege. The other roles are either unnecessary or too broad for this purpose. For example, the logging.editor role grants permissions to create and update logs, log sinks, and log exclusions, which are not required for writing log entries [6]. The logging.admin role grants permissions to delete logs, log sinks, and log exclusions, which are not required for writing log entries and may pose a security risk if misused. The monitoring.editor role grants permissions to create and update alerting policies, uptime checks, notification channels, dashboards, and groups, which are not required for writing custom metrics.

Reference: [1] Service accounts. [2] Setting up Stackdriver Logging for Compute Engine. [3] Setting up Stackdriver Monitoring for Compute Engine. [4]-[8] Predefined roles.
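The least-privilege grant for Question 30 can be written down as IAM policy bindings. The sketch below builds them as plain dictionaries in the shape used by IAM policies; the service account email is a hypothetical placeholder, and nothing is sent to any API.

```python
# The two writer roles from the explanation, in full IAM role-name form.
LEAST_PRIVILEGE_ROLES = ("roles/logging.logWriter", "roles/monitoring.metricWriter")


def build_bindings(service_account_email: str) -> list[dict]:
    """Return IAM policy bindings that grant only the two writer roles."""
    member = f"serviceAccount:{service_account_email}"
    return [{"role": role, "members": [member]} for role in LEAST_PRIVILEGE_ROLES]


# Hypothetical service account email, used purely for illustration.
bindings = build_bindings("123456789012-compute@developer.gserviceaccount.com")
for binding in bindings:
    print(binding)
```

Keeping the role list explicit like this makes it easy to review that no editor or admin role has slipped into the grant.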

Question 31

You support a popular mobile game application deployed on Google Kubernetes Engine (GKE) across several Google Cloud regions. Each region has multiple Kubernetes clusters. You receive a report that none of the users in a specific region can connect to the application. You want to resolve the incident while following Site Reliability Engineering practices. What should you do first?

Options:

Reroute the user traffic from the affected region to other regions that don’t report issues.

Use Stackdriver Monitoring to check for a spike in CPU or memory usage for the affected region.

Add an extra node pool that consists of high memory and high CPU machine type instances to the cluster.

Use Stackdriver Logging to filter on the clusters in the affected region, and inspect error messages in the logs.

Question 32

Your organization is running multiple Google Kubernetes Engine (GKE) clusters in a project. You need to design a highly-available solution to collect and query both domain-specific workload metrics and GKE default metrics across all clusters, while minimizing operational overhead. What should you do?

Options:

Use Prometheus Operator to install Prometheus in every cluster and scrape the metrics. Ensure that a Thanos sidecar is enabled on every Prometheus instance. Configure Thanos in the central cluster. Query the central Thanos instance.

Use Prometheus Operator to install Prometheus in every cluster and scrape the metrics. Configure remote-write to one central Prometheus. Query the central Prometheus instance.

Enable managed collection on every GKE cluster. Query the metrics in Cloud Monitoring.

Enable managed collection on every GKE cluster. Query the metrics in BigQuery.

Question 33

You are running an application on Compute Engine and collecting logs through Stackdriver. You discover that some personally identifiable information (PII) is leaking into certain log entry fields. You want to prevent these fields from being written in new log entries as quickly as possible. What should you do?

Options:

Use the filter-record-transformer Fluentd filter plugin to remove the fields from the log entries in flight.

Use the fluent-plugin-record-reformer Fluentd output plugin to remove the fields from the log entries in flight.

Wait for the application developers to patch the application, and then verify that the log entries are no longer exposing PII.

Stage log entries to Cloud Storage, and then trigger a Cloud Function to remove the fields and write the entries to Stackdriver via the Stackdriver Logging API.

Question 34

Your company follows Site Reliability Engineering practices. You are the Incident Commander for a new, customer-impacting incident. You need to immediately assign two incident management roles to assist you in an effective incident response. What roles should you assign?

Choose 2 answers

Options:

Operations Lead

Engineering Lead

Communications Lead

Customer Impact Assessor

External Customer Communications Lead

Question 35

Your company is using HTTPS requests to trigger a public Cloud Run-hosted service accessible at the https://booking-engine-abcdef.a.run.app URL. You need to give developers the ability to test the latest revisions of the service before the service is exposed to customers. What should you do?

Options:

Run the gcloud run deploy booking-engine --no-traffic --tag dev command. Use the https://dev---booking-engine-abcdef.a.run.app URL for testing.

Run the gcloud run services update-traffic booking-engine --to-revisions LATEST=1 command. Use the https://booking-engine-abcdef.a.run.app URL for testing.

Pass the curl -H "Authorization: Bearer $(gcloud auth print-identity-token)" auth token. Use the https://booking-engine-abcdef.a.run.app URL to test privately.

Grant the roles/run.invoker role to the developers testing the booking-engine service. Use the https://booking-engine-abcdef.private.run.app URL for testing.

Question 36

Your company has recently experienced several production service issues. You need to create a Cloud Monitoring dashboard to troubleshoot the issues, and you want to use the dashboard to distinguish between failures in your own service and those caused by a Google Cloud service that you use. What should you do?

Options:

Enable Personalized Service Health annotations on the dashboard.

Create an alerting policy for the system error metrics.

Create a log-based metric to track cloud service errors, and display the metric on the dashboard.

Create a logs widget to display system errors from Cloud Logging on the dashboard.

Answer: A

Explanation:

Comprehensive and Detailed Explanation From General Cloud Monitoring Knowledge:

The key requirement is to distinguish between failures in your own service and those caused by an underlying Google Cloud service.

A. Enable Personalized Service Health annotations on the dashboard: Google Cloud Personalized Service Health provides information about incidents affecting Google Cloud services that may impact your projects. When enabled and integrated with Monitoring, it can display these events as annotations on your dashboards, overlaying them on your service's metrics charts. This allows you to correlate dips in your service's performance with known Google Cloud service issues, directly addressing the need to distinguish failure origins.

B. Create an alerting policy for the system error metrics: Alerting policies are for notifications when metrics cross thresholds. While useful for detecting issues in your own service, they don't inherently distinguish the cause between your service and a Google Cloud dependency without further context, which option A provides.

C. Create a log-based metric to track cloud service errors, and display the metric on the dashboard: You could try to create log-based metrics from logs that might indicate a cloud service error (e.g., specific API error codes from Google Cloud services). However, this is indirect, might require complex parsing, and Personalized Service Health is a more direct and authoritative source for Google Cloud service disruptions.

D. Create a logs widget to display system errors from Cloud Logging on the dashboard: Similar to C, displaying raw system error logs can be helpful for troubleshooting your own service, but it doesn't provide a clear, curated view of whether a Google Cloud service itself is having an issue. It would require manual interpretation to link these logs to a potential Google Cloud outage.

Personalized Service Health is specifically designed to provide visibility into Google Cloud service incidents relevant to your resources. Integrating this with Monitoring dashboards is the most direct way to achieve the stated goal.

Reference (Based on Cloud Monitoring and Personalized Service Health features):

Personalized Service Health Overview: https://cloud.google.com/service-health/docs/overview

Integrating with Cloud Monitoring: Documentation often shows how to enable annotations for Personalized Service Health events on Monitoring charts. This allows a visual correlation between your service metrics and Google Cloud service health events.

"Personalized Service Health integrates with Cloud Monitoring so you can see service health events alongside your metrics."

"You can enable annotations on your metric charts to display relevant Personalized Service Health events."

This feature directly helps differentiate between issues in your application versus issues in the underlying Google Cloud services.

Question 37

Your organization is starting to containerize with Google Cloud. You need a fully managed storage solution for container images and Helm charts. You need to identify a storage solution that has native integration into existing Google Cloud services, including Google Kubernetes Engine (GKE), Cloud Run, VPC Service Controls, and Identity and Access Management (IAM). What should you do?

Options:

Use Docker to configure a Cloud Storage driver pointed at the bucket owned by your organization.

Configure Container Registry as an OCI-based container registry for container images.

Configure Artifact Registry as an OCI-based container registry for both Helm charts and container images.

Configure an open source container registry server to run in GKE with a restrictive role-based access control (RBAC) configuration.

Question 38

You recently configured an App Hub application. You are able to see the managed instance group, backend service, and URL map listed in App Hub, but you do not see the forwarding rule. You must ensure that the forwarding rule is listed. What should you do?

Options:

Attach the project containing the forwarding rule as an App Hub service project.

Enable the App Hub API in the project containing the forwarding rule.

Configure the forwarding rule to forward to the correct target proxy.

Register the forwarding rule as a service in the application configuration.

Answer: D

Explanation:

Comprehensive and Detailed Explanation From General Google Cloud Knowledge:

App Hub allows you to organize and discover services and applications within your Google Cloud environment. For App Hub to recognize and display resources as components of an "application," these resources often need to be explicitly registered or discovered as "services" within that application's configuration. While App Hub can automatically discover some resources (like GKE workloads, Cloud Run services), for other resources, or to establish specific relationships, manual registration or more detailed configuration is sometimes required.

Option A (Attach the project containing the forwarding rule as an App Hub service project): While App Hub works across projects (host project for the application, service projects for services and workloads), simply attaching the project might not be sufficient for App Hub to automatically pick up and categorize every resource like a forwarding rule specifically for a defined application without further context. The forwarding rule needs to be associated with a service within the App Hub application.

Option B (Enable the App Hub API in the project containing the forwarding rule): The App Hub API needs to be enabled in projects where you want to manage App Hub resources (applications, services, workloads). If it wasn't enabled, you likely wouldn't be able to see any resources from that project. Since other resources are visible, this is less likely the root cause for a single missing resource, though it's a prerequisite for App Hub to function at all with that project.

Option C (Configure the forwarding rule to forward to the correct target proxy): While correct configuration of the forwarding rule is essential for its operational functionality, App Hub's ability to list the forwarding rule is more about its discovery and registration within App Hub's model rather than its traffic-directing correctness. An incorrectly configured forwarding rule that is properly registered might still appear in App Hub, perhaps with an error status.

Option D (Register the forwarding rule as a service in the application configuration): App Hub applications are composed of "services," and these services are in turn composed of "workloads" or other discovered/registered resources. A forwarding rule is typically an entry point or part of the infrastructure for a service. Explicitly registering it or the resource it points to (which then allows App Hub to trace back to the forwarding rule) as a service or part of a service within the application configuration would make it visible and properly cataloged by App Hub. App Hub discovers resources by looking for specific labels or by manual registration. If it's not automatically discovered as part of a recognized workload (like a GCE instance group service exposed via a load balancer), explicit registration is often the way to make it appear.

Reference (Based on general App Hub functionality):

App Hub discovers resources that are part of registered applications and their services. Services in App Hub can be based on various Google Cloud resources. If a resource like a forwarding rule isn't automatically linked to a displayed workload, it might need to be explicitly defined as a service or part of a service.

From the Google Cloud documentation on App Hub concepts:

"Applications are the core organizational unit in App Hub. An application represents a logical system that delivers business value... Services represent the logical components of an application... Workloads are instances of your services running on Google Cloud infrastructure. App Hub automatically discovers workloads for supported resource types or you can manually register them."

Forwarding rules are associated with load balancing, which exposes services. If the service that the forwarding rule points to is correctly registered and identified by App Hub, associated infrastructure like the forwarding rule should typically be discoverable. If it's not, ensuring the service it fronts is correctly registered and that App Hub understands this link is key. Option D aligns with this concept of ensuring the relevant component (which the forwarding rule is part of) is registered within the application structure.

You can find more information in the official Google Cloud documentation regarding App Hub:

App Hub overview: https://cloud.google.com/app-hub/docs/overview

Registering services and workloads: Documentation would detail how different resources are discovered or need to be registered.

Question 39

You support a web application that runs on App Engine and uses CloudSQL and Cloud Storage for data storage. After a short spike in website traffic, you notice a big increase in latency for all user requests, increase in CPU use, and the number of processes running the application. Initial troubleshooting reveals:

After the initial spike in traffic, load levels returned to normal but users still experience high latency.

Requests for content from the CloudSQL database and images from Cloud Storage show the same high latency.

No changes were made to the website around the time the latency increased.

There is no increase in the number of errors to the users.

You expect another spike in website traffic in the coming days and want to make sure users don’t experience latency. What should you do?

Options:

Upgrade the GCS buckets to Multi-Regional.

Enable high availability on the Cloud SQL instances.

Move the application from App Engine to Compute Engine.

Modify the App Engine configuration to have additional idle instances.

Question 40

You are working with a government agency that requires you to archive application logs for seven years. You need to configure Stackdriver to export and store the logs while minimizing costs of storage. What should you do?

Options:

Create a Cloud Storage bucket and develop your application to send logs directly to the bucket.

Develop an App Engine application that pulls the logs from Stackdriver and saves them in BigQuery.

Create an export in Stackdriver and configure Cloud Pub/Sub to store logs in permanent storage for seven years.

Create a sink in Stackdriver, name it, create a bucket on Cloud Storage for storing archived logs, and then select the bucket as the log export destination.

Question 41

You are deploying an application that needs to access sensitive information. You need to ensure that this information is encrypted and the risk of exposure is minimal if a breach occurs. What should you do?

Options:

Store the encryption keys in Cloud Key Management Service (KMS) and rotate the keys frequently

Inject the secret at the time of instance creation via an encrypted configuration management system.

Integrate the application with a Single sign-on (SSO) system and do not expose secrets to the application

Leverage a continuous build pipeline that produces multiple versions of the secret for each instance of the application.

Question 42

You are building and running client applications in Cloud Run and Cloud Functions. Your client requires that all logs must be available for one year so that the client can import the logs into their logging service. You must minimize required code changes. What should you do?

Options:

Update all images in Cloud Run and all functions in Cloud Functions to send logs to both Cloud Logging and the client's logging service. Ensure that all the ports required to send logs are open in the VPC firewall.

Create a Pub/Sub topic, subscription, and logging sink. Configure the logging sink to send all logs into the topic. Give your client access to the topic to retrieve the logs.

Create a storage bucket and appropriate VPC firewall rules. Update all images in Cloud Run and all functions in Cloud Functions to send logs to a file within the storage bucket.

Create a logs bucket and logging sink. Set the retention on the logs bucket to 365 days. Configure the logging sink to send logs to the bucket. Give your client access to the bucket to retrieve the logs.

Question 43

You are using Jenkins to orchestrate your CI/CD pipelines deploying your applications to GKE and on-premises VMs. After expanding your on-premises infrastructure, all resource utilization is normal, but you notice that initiating on-premises deployments is taking longer than expected. You need to accelerate the initiation of on-premises deployments. What should you do?

Options:

Implement a blue-green deployment strategy for your on-premises VMs.

Upgrade your on-premises VMs to have faster CPUs and more memory.

Increase the number of Jenkins executor threads.

Configure Jenkins to use an artifact repository in your on-premises environment.

Question 44

You need to introduce postmortems into your organization during the holiday shopping season. You are expecting your web application to receive a large volume of traffic in a short period. You need to prepare your application for potential failures during the event. What should you do?

Choose 2 answers

Options:

Monitor latency of your services for average percentile latency.

Review your increased capacity requirements and plan for the required quota management.

Create alerts in Cloud Monitoring for all common failures that your application experiences.

Ensure that relevant system metrics are being captured with Cloud Monitoring and create alerts at levels of interest.

Configure Anthos Service Mesh on the application to identify issues on the topology map.

Question 45

You created a Stackdriver chart for CPU utilization in a dashboard within your workspace project. You want to share the chart with your Site Reliability Engineering (SRE) team only. You want to ensure you follow the principle of least privilege. What should you do?

Options:

Share the workspace Project ID with the SRE team. Assign the SRE team the Monitoring Viewer IAM role in the workspace project.

Share the workspace Project ID with the SRE team. Assign the SRE team the Dashboard Viewer IAM role in the workspace project.

Click "Share chart by URL" and provide the URL to the SRE team. Assign the SRE team the Monitoring Viewer IAM role in the workspace project.

Click "Share chart by URL" and provide the URL to the SRE team. Assign the SRE team the Dashboard Viewer IAM role in the workspace project.

Question 46

You are leading a DevOps project for your organization. The DevOps team is responsible for managing the service infrastructure and being on-call for incidents. The Software Development team is responsible for writing, submitting, and reviewing code. Neither team has any published SLOs. You want to design a new joint-ownership model for a service between the DevOps team and the Software Development team. Which responsibilities should be assigned to each team in the new joint-ownership model?

[Image: table of team responsibilities for Options A through D, not reproduced]

Options:

Option A

Option B

Option C

Option D

Answer:

Explanation:

The correct answer is Option D.

According to DevOps best practices, a joint-ownership model for a service between the DevOps team and the Software Development team should follow these principles:

The DevOps team and the Software Development team should share the responsibility and collaboration for managing the service infrastructure, performing code reviews, and adopting and sharing SLOs for the service.

The DevOps team and the Software Development team should have end-to-end ownership of the service, from design to development to deployment to operation to maintenance.

The DevOps team and the Software Development team should use common tools and processes to facilitate communication, coordination, and feedback.

The DevOps team and the Software Development team should align their goals and incentives with the business outcomes and customer satisfaction.

Option D is the only option that reflects these principles. It assigns both teams the responsibilities of managing the service infrastructure, performing code reviews, and adopting and sharing SLOs for the service. It also implies that both teams have end-to-end ownership of the service, as they are involved in every stage of the service lifecycle. It encourages both teams to use common tools and processes, such as GitLab, to collaborate and communicate effectively. Finally, it aligns both teams with the business outcomes and customer satisfaction, as they use SLOs to measure and improve the service quality.

The other options are incorrect because they do not follow the DevOps best practices. Option A is incorrect because it assigns only the DevOps team the responsibility of managing the service infrastructure, which creates a silo between the two teams and reduces their collaboration. Option A also does not assign any responsibility for adopting and sharing SLOs for the service, which means that both teams lack a common metric for measuring and improving the service quality. Option B is incorrect because it assigns only the Software Development team the responsibility of performing code reviews, which creates a gap between the two teams and reduces their feedback. Option B also does not assign any responsibility for adopting and sharing SLOs for the service, which means that both teams lack a common metric for measuring and improving the service quality. Option C is incorrect because it assigns both teams the same responsibilities as option A and option B, which combines their drawbacks.

Reference: 5 key organizational models for DevOps teams | GitLab. Building a Culture of Full-Service Ownership - DevOps.com. GitLab.

Question 47

You are currently planning how to display Cloud Monitoring metrics for your organization's Google Cloud projects. Your organization has three folders and six projects:

[Image: diagram of the organization's three folders and six projects, not reproduced]

You want to configure Cloud Monitoring dashboards to only display metrics from the projects within one folder. You need to ensure that the dashboards do not display metrics from projects in the other folders. You want to follow Google-recommended practices. What should you do?

Options:

Create a single new scoping project

Create new scoping projects for each folder

Use the current app-one-prod project as the scoping project

Use the current app-one-dev, app-one-staging and app-one-prod projects as the scoping project for each folder

Question 48

You are using Stackdriver to monitor applications hosted on Google Cloud Platform (GCP). You recently deployed a new application, but its logs are not appearing on the Stackdriver dashboard.

You need to troubleshoot the issue. What should you do?

Options:

Confirm that the Stackdriver agent has been installed in the hosting virtual machine.

Confirm that your account has the proper permissions to use the Stackdriver dashboard.

Confirm that port 25 has been opened in the firewall to allow messages through to Stackdriver.

Confirm that the application is using the required client library and the service account key has proper permissions.

Question 49

Your organization recently adopted a container-based workflow for application development. Your team develops numerous applications that are deployed continuously through an automated build pipeline to the production environment. A recent security audit alerted your team that the code pushed to production could contain vulnerabilities and that the existing tooling around virtual machine (VM) vulnerabilities no longer applies to the containerized environment. You need to ensure the security and patch level of all code running through the pipeline. What should you do?

Options:

Set up Container Analysis to scan and report Common Vulnerabilities and Exposures.

Configure the containers in the build pipeline to always update themselves before release.

Reconfigure the existing operating system vulnerability software to exist inside the container.

Implement static code analysis tooling against the Docker files used to create the containers.

Question 50

You recently created a Cloud Build pipeline for deploying Terraform code stored in a GitHub repository. You make Terraform code changes in short-lived branches and sometimes use tags during development. You tag releases with a semantic version when they are ready for deployment. You require your pipeline to apply the Terraform code whenever there is a new release, and you need to minimize operational overhead. What should you do?

Options:

Create a build trigger with the * branch pattern.

Create a build trigger with the \d+\.\d+\.\d* tag pattern.

Create a build trigger with the .* tag pattern.

Create a build trigger with the \d*\.\d+\.\d* branch pattern.
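To see how the candidate regular expressions in the options above behave against typical ref names, here is a minimal sketch. It uses Python's `re` module rather than Cloud Build itself, and it assumes the trigger pattern has to match the whole ref name. The sample tag names are hypothetical.

```python
import re

# Patterns copied from the options above.
SEMVER_TAG = r"\d+\.\d+\.\d*"
ANY_TAG = r".*"

# Hypothetical tag names, for illustration only.
tags = ["1.4.2", "0.10.0", "wip-test", "debug"]


def matching(pattern, names):
    """Return the names that the pattern matches in full."""
    compiled = re.compile(pattern)
    return [name for name in names if compiled.fullmatch(name)]


semver_matches = matching(SEMVER_TAG, tags)
any_matches = matching(ANY_TAG, tags)

print(semver_matches)  # only the semantic-version tags
print(any_matches)     # every tag, including development tags
```

Under that assumption, the semantic-version pattern skips ad hoc development tags, while `.*` matches every tag that is pushed.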

Question 51

Your company follows Site Reliability Engineering practices. You are the person in charge of Communications for a large, ongoing incident affecting your customer-facing applications. There is still no estimated time for a resolution of the outage. You are receiving emails from internal stakeholders who want updates on the outage, as well as emails from customers who want to know what is happening. You want to efficiently provide updates to everyone affected by the outage. What should you do?

Options:

Focus on responding to internal stakeholders at least every 30 minutes. Commit to "next update" times.

Provide periodic updates to all stakeholders in a timely manner. Commit to a "next update" time in all communications.

Delegate the responding to internal stakeholder emails to another member of the Incident Response Team. Focus on providing responses directly to customers.

Provide all internal stakeholder emails to the Incident Commander, and allow them to manage internal communications. Focus on providing responses directly to customers.

Question 52

You are building an application that runs on Cloud Run. The application needs to access a third-party API by using an API key. You need to determine a secure way to store and use the API key in your application by following Google-recommended practices. What should you do?

Options:

Save the API key in Secret Manager as a secret. Reference the secret as an environment variable in the Cloud Run application.

Save the API key in Secret Manager as a secret key. Mount the secret key under the /sys/api_key directory and decrypt the key in the Cloud Run application.

Save the API key in Cloud Key Management Service (Cloud KMS) as a key. Reference the key as an environment variable in the Cloud Run application.

Encrypt the API key by using Cloud Key Management Service (Cloud KMS) and pass the key to Cloud Run as an environment variable. Decrypt and use the key in Cloud Run.

Question 53

Your application services run in Google Kubernetes Engine (GKE). You want to make sure that only images from your centrally-managed Google Container Registry (GCR) image registry in the altostrat-images project can be deployed to the cluster while minimizing development time. What should you do?

Options:

Create a custom builder for Cloud Build that will only push images to gcr.io/altostrat-images.

Use a Binary Authorization policy that includes the whitelist name pattern gcr.io/altostrat-images/.

Add logic to the deployment pipeline to check that all manifests contain only images from gcr.io/altostrat-images.

Add a tag to each image in gcr.io/altostrat-images and check that this tag is present when the image is deployed.

Question 54

You are managing an application that runs in Compute Engine. The application uses a custom HTTP server to expose an API that is accessed by other applications through an internal TCP/UDP load balancer. A firewall rule allows access to the API port from 0.0.0.0/0. You need to configure Cloud Logging to log each IP address that accesses the API by using the fewest number of steps. What should you do first?

Options:

Enable Packet Mirroring on the VPC

Install the Ops Agent on the Compute Engine instances.

Enable logging on the firewall rule

Enable VPC Flow Logs on the subnet

Question 55

You use a multiple step Cloud Build pipeline to build and deploy your application to Google Kubernetes Engine (GKE). You want to integrate with a third-party monitoring platform by performing a HTTP POST of the build information to a webhook. You want to minimize the development effort. What should you do?

Options:

Add logic to each Cloud Build step to HTTP POST the build information to a webhook.

Add a new step at the end of the pipeline in Cloud Build to HTTP POST the build information to a webhook.

Use Stackdriver Logging to create a logs-based metric from the Cloud Build logs. Create an Alert with a Webhook notification type.

Create a Cloud Pub/Sub push subscription to the Cloud Build cloud-builds PubSub topic to HTTP POST the build information to a webhook.

Question 56

Your company runs services by using multiple globally distributed Google Kubernetes Engine (GKE) clusters. Your operations team has set up workload monitoring that uses Prometheus-based tooling for metrics, alerts, and generating dashboards. This setup does not provide a method to view metrics globally across all clusters. You need to implement a scalable solution to support global Prometheus querying and minimize management overhead. What should you do?

Options:

Configure Prometheus cross-service federation for centralized data access

Configure workload metrics within Cloud Operations for GKE

Configure Prometheus hierarchical federation for centralized data access

Configure Google Cloud Managed Service for Prometheus

Question 57

You support an application running on App Engine. The application is used globally and accessed from various device types. You want to know the number of connections. You are using Stackdriver Monitoring for App Engine. What metric should you use?

Options:

flex/connections/current

tcp_ssl_proxy/new_connections

tcp_ssl_proxy/open_connections

flex/instance/connections/current

Question 58

You are running an application on Compute Engine and collecting logs through Stackdriver. You discover that some personally identifiable information (PII) is leaking into certain log entry fields. All PII entries begin with the text userinfo. You want to capture these log entries in a secure location for later review and prevent them from leaking to Stackdriver Logging. What should you do?

Options:

Create a basic log filter matching userinfo, and then configure a log export in the Stackdriver console with Cloud Storage as a sink.

Use a Fluentd filter plugin with the Stackdriver Agent to remove log entries containing userinfo, and then copy the entries to a Cloud Storage bucket.

Create an advanced log filter matching userinfo, configure a log export in the Stackdriver console with Cloud Storage as a sink, and then configure a log exclusion with userinfo as a filter.

Use a Fluentd filter plugin with the Stackdriver Agent to remove log entries containing userinfo, create an advanced log filter matching userinfo, and then configure a log export in the Stackdriver console with Cloud Storage as a sink.
