Databricks Databricks-Generative-AI-Engineer-Associate Databricks Certified Generative AI Engineer Associate Exam Practice Test

Databricks Certified Generative AI Engineer Associate Questions and Answers

Question 1

A Generative AI Engineer is creating an LLM system that retrieves news articles from the year 1918 that are related to a user's query and summarizes them. The engineer has noticed that the summaries are generated well but often also include an explanation of how the summary was generated, which is undesirable.

Which change could the Generative AI Engineer perform to mitigate this issue?

Options:

A.

Split the LLM output by newline characters to truncate away the summarization explanation.

B.

Tune the chunk size of news articles or experiment with different embedding models.

C.

Revisit their document ingestion logic, ensuring that the news articles are being ingested properly.

D.

Provide few-shot examples of the desired output format in the system and/or user prompt.
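
As an illustration of the few-shot technique named in option D, the sketch below assembles a prompt in plain Python. The function name `build_few_shot_prompt`, the instruction wording, and the example articles are all hypothetical, and no particular LLM API is assumed.

```python
def build_few_shot_prompt(examples, article):
    """Assemble a prompt that shows the model the desired output format.

    `examples` is a list of (article_text, summary_only) pairs.
    """
    parts = ["Summarize the article. Return only the summary, with no explanation."]
    for source_text, summary in examples:
        parts.append(f"Article: {source_text}\nSummary: {summary}")
    # End with an open "Summary:" slot so the model continues in the same format.
    parts.append(f"Article: {article}\nSummary:")
    return "\n\n".join(parts)


# Hypothetical example pairs, invented for illustration only.
EXAMPLES = [
    ("City council votes to extend tram line to the harbour.",
     "The council approved a tram extension to the harbour."),
    ("Local bakery wins regional bread competition.",
     "A local bakery won a regional bread contest."),
]

prompt = build_few_shot_prompt(EXAMPLES, "Armistice signed; fighting ends on the Western Front.")
```

Each example pairs an article with a summary that contains no commentary, so the pattern the model sees ends at the summary itself.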

Question 2

A small and cost-conscious startup in the cancer research field wants to build a RAG application using Foundation Model APIs.

Which strategy would allow the startup to build a good-quality RAG application while being cost-conscious and able to cater to customer needs?

Options:

A.

Limit the number of relevant documents available for the RAG application to retrieve from

B.

Pick a smaller LLM that is domain-specific

C.

Limit the number of queries a customer can send per day

D.

Use the largest LLM possible because that gives the best performance for any general queries

Question 3

A Generative AI Engineer wants to build an LLM-based solution to help a restaurant improve its online customer experience with bookings by automatically handling common customer inquiries. The goal of the solution is to minimize escalations to human intervention and phone calls while maintaining a personalized interaction. To design the solution, the Generative AI Engineer needs to define the input data to the LLM and the task it should perform.

Which input/output pair will support their goal?

Options:

A.

Input: Online chat logs; Output: Group the chat logs by users, followed by summarizing each user’s interactions

B.

Input: Online chat logs; Output: Buttons that represent choices for booking details

C.

Input: Customer reviews; Output: Classify review sentiment

D.

Input: Online chat logs; Output: Cancellation options

Question 4

What is an effective method to preprocess prompts using custom code before sending them to an LLM?

Options:

A.

Directly modify the LLM’s internal architecture to include preprocessing steps

B.

It is better not to introduce custom code to preprocess prompts as the LLM has not been trained with examples of the preprocessed prompts

C.

Rather than preprocessing prompts, it’s more effective to postprocess the LLM outputs to align the outputs to desired outcomes

D.

Write an MLflow PyFunc model that has a separate function to process the prompts
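
The shape described in option D, a model wrapper with a separate prompt-processing function, might look roughly like the sketch below. It is written without importing MLflow so that it runs on its own; the class name, the template text, and the echo-style LLM stub are hypothetical.

```python
import re


def preprocess_prompt(prompt: str) -> str:
    """Normalize whitespace and wrap the text in a fixed instruction template."""
    cleaned = re.sub(r"\s+", " ", prompt).strip()
    return f"Answer concisely.\n\nQuestion: {cleaned}"


class PromptPreprocessingModel:
    """Stand-in for a custom model class.

    Assumption: in MLflow this would subclass mlflow.pyfunc.PythonModel, whose
    predict method receives (context, model_input). MLflow is not imported
    here so the sketch runs on its own.
    """

    def __init__(self, llm):
        self.llm = llm  # any callable that maps a prompt string to a string

    def predict(self, context, model_input):
        # Preprocess every prompt before handing it to the LLM callable.
        return [self.llm(preprocess_prompt(p)) for p in model_input]


# Hypothetical LLM stub that echoes its prompt, so the preprocessing is visible.
model = PromptPreprocessingModel(llm=lambda prompt: prompt)
outputs = model.predict(None, ["  What   is\nRAG? "])
```

Keeping `preprocess_prompt` as its own function means it can be unit-tested separately from the model call.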

Question 5

A company has a typical RAG-enabled, customer-facing chatbot on its website.

[Diagram not reproduced in this extract]

Select the correct sequence of components a user's questions will go through before the final output is returned. Use the diagram above for reference.

Options:

A.

1. embedding model, 2. vector search, 3. context-augmented prompt, 4. response-generating LLM

B.

1. context-augmented prompt, 2. vector search, 3. embedding model, 4. response-generating LLM

C.

1. response-generating LLM, 2. vector search, 3. context-augmented prompt, 4. embedding model

D.

1. response-generating LLM, 2. context-augmented prompt, 3. vector search, 4. embedding model
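
To make the four components concrete, here is a toy, self-contained sketch of one common RAG flow, in the ordering listed in option A. The bag-of-words "embedding", the in-memory document list, and the `fake_llm` stub are all hypothetical stand-ins for real components.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: a bag-of-words count vector (stand-in for a real model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def vector_search(query_vec, index, k=1):
    """Return the k documents whose vectors are most similar to the query."""
    ranked = sorted(index, key=lambda doc: cosine(query_vec, embed(doc)), reverse=True)
    return ranked[:k]


def build_prompt(question, context_docs):
    context = "\n".join(context_docs)
    return f"Context:\n{context}\n\nQuestion: {question}"


def fake_llm(prompt):
    """Stub generator: a real system would call an LLM here."""
    return f"[answer based on {len(prompt)} prompt characters]"


DOCS = ["Returns are accepted within 30 days.", "Shipping takes five business days."]

question = "How long does shipping take?"
query_vec = embed(question)                 # 1. embedding model
hits = vector_search(query_vec, DOCS)       # 2. vector search
prompt = build_prompt(question, hits)       # 3. context-augmented prompt
answer = fake_llm(prompt)                   # 4. response-generating LLM
```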

Question 6

A Generative AI Engineer is creating an agent-based LLM system for their favorite monster truck team. The system can answer text-based questions about the monster truck team, look up event dates via an API call, or query tables on the team’s latest standings.

How could the Generative AI Engineer best design these capabilities into their system?

Options:

A.

Ingest PDF documents about the monster truck team into a vector store and query it in a RAG architecture.

B.

Write a system prompt for the agent listing available tools and bundle it into an agent system that runs a number of calls to solve a query.

C.

Instruct the LLM to respond with “RAG”, “API”, or “TABLE” depending on the query, then use text parsing and conditional statements to resolve the query.

D.

Build a system prompt containing all possible event dates and table information. Use a RAG architecture to look up generic text questions and otherwise leverage the information in the system prompt.

Question 7

A Generative AI Engineer is deciding between using LSH (Locality Sensitive Hashing) and HNSW (Hierarchical Navigable Small World) for indexing their vector database. Their top priority is semantic accuracy.

Which approach should the Generative AI Engineer use to evaluate these two techniques?

Options:

A.

Compare the cosine similarities of the embeddings of returned results against those of a representative sample of test inputs

B.

Compare the Bilingual Evaluation Understudy (BLEU) scores of returned results for a representative sample of test inputs

C.

Compare the Recall-Oriented Understudy for Gisting Evaluation (ROUGE) scores of returned results for a representative sample of test inputs

D.

Compare the Levenshtein distances of returned results against a representative sample of test inputs
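
The embedding-based comparison mentioned in option A can be sketched in a few lines. The vectors below are hypothetical 2-D embeddings chosen so the arithmetic is easy to follow, and `mean_similarity` is an illustrative helper, not part of any library.

```python
import math


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def mean_similarity(query_vec, result_vecs):
    """Average cosine similarity between a query and the results an index returned."""
    return sum(cosine_similarity(query_vec, r) for r in result_vecs) / len(result_vecs)


# Hypothetical 2-D embeddings; real embeddings have hundreds of dimensions.
query = [1.0, 0.0]
results_index_a = [[1.0, 0.0], [0.6, 0.8]]   # e.g. results from one index type
results_index_b = [[0.0, 1.0], [0.6, 0.8]]   # e.g. results from the other

score_a = mean_similarity(query, results_index_a)
score_b = mean_similarity(query, results_index_b)
```

In practice the comparison would be averaged over a representative sample of test queries rather than a single one.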

Question 8

A Generative AI Engineer is responsible for developing a chatbot to enable their company’s internal HelpDesk Call Center team to more quickly find related tickets and provide resolution. While creating the GenAI application work breakdown tasks for this project, they realize they need to start planning which data sources (either Unity Catalog volume or Delta table) they could choose for this application. They have collected several candidate data sources for consideration:

call_rep_history: a Delta table with primary keys representative_id, call_id. This table is maintained to calculate representatives’ call resolution from fields call_duration and call_start_time.

transcript Volume: a Unity Catalog Volume containing all recordings as *.wav files, along with text transcripts as *.txt files.

call_cust_history: a Delta table with primary keys customer_id, call_id. This table is maintained to calculate how much internal customers use the HelpDesk to make sure that the charge back model is consistent with actual service use.

call_detail: a Delta table that includes a snapshot of all call details updated hourly. It includes root_cause and resolution fields, but those fields may be empty for calls that are still active.

maintenance_schedule: a Delta table that includes a listing of both HelpDesk application outages as well as planned upcoming maintenance downtimes.

They need sources that could add context to best identify ticket root cause and resolution.

Which TWO sources do that? (Choose two.)

Options:

A.

call_cust_history

B.

maintenance_schedule

C.

call_rep_history

D.

call_detail

E.

transcript Volume

Question 9

A Generative AI Engineer is using LangGraph to define multiple tools in a single agentic application. They want to enable the main orchestrator LLM to decide on its own which tools are most appropriate to call for a given prompt. To do this, they must determine the general flow of the code. Which sequence will do this?

Options:

A.

1. Define or import the tools 2. Add tools and LLM to the agent 3. Create the ReAct agent

B.

1. Define or import the tools 2. Define the agent 3. Initialize the agent with ReAct, the LLM, and the tools

C.

1. Define the tools 2. Load each tool into a separate agent 3. Instruct the LLM to use ReAct to call the appropriate agent

D.

1. Define the tools inside the agents 2. Load the agents into the LLM 3. Instruct the LLM to use COT reasoning to determine the appropriate agent

Question 10

What is the most suitable library for building a multi-step LLM-based workflow?

Options:

A.

Pandas

B.

TensorFlow

C.

PySpark

D.

LangChain

Question 11

A Generative AI Engineer has created a RAG application to look up answers to questions about a series of fantasy novels that are being asked on the author’s web forum. The fantasy novel texts are chunked and embedded into a vector store with metadata (page number, chapter number, book title), retrieved with the user’s query, and provided to an LLM for response generation. The Generative AI Engineer used their intuition to pick the chunking strategy and associated configurations but now wants to more methodically choose the best values.

Which TWO strategies should the Generative AI Engineer take to optimize their chunking strategy and parameters? (Choose two.)

Options:

A.

Change embedding models and compare performance.

B.

Add a classifier for user queries that predicts which book will best contain the answer. Use this to filter retrieval.

C.

Choose an appropriate evaluation metric (such as recall or NDCG) and experiment with changes in the chunking strategy, such as splitting chunks by paragraphs or chapters. Choose the strategy that gives the best performance metric.

D.

Pass known questions and best answers to an LLM and instruct the LLM to provide the best token count. Use a summary statistic (mean, median, etc.) of the best token counts to choose chunk size.

E.

Create an LLM-as-a-judge metric to evaluate how well previous questions are answered by the most appropriate chunk. Optimize the chunking parameters based upon the values of the metric.
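
A metric-driven comparison of chunking strategies, as described in option C, could be sketched with recall@k. The strategy names, chunk ids, and relevance labels below are hypothetical; a real evaluation would draw them from retrieval runs against labeled questions.

```python
def recall_at_k(retrieved_ids, relevant_ids, k):
    """Fraction of the relevant chunks that appear in the top-k retrieved chunks."""
    top_k = set(retrieved_ids[:k])
    return len(top_k & set(relevant_ids)) / len(relevant_ids)


def mean_recall(runs, k):
    """Average recall@k over a list of (retrieved, relevant) pairs."""
    return sum(recall_at_k(r, rel, k) for r, rel in runs) / len(runs)


# Hypothetical retrieval results per strategy: (retrieved chunk ids, relevant chunk ids).
RESULTS = {
    "by_paragraph": [(["p3", "p9", "p1"], ["p3", "p1"]), (["p7", "p2", "p5"], ["p2"])],
    "by_chapter": [(["c1", "c4", "c2"], ["c2", "c5"]), (["c3", "c1", "c6"], ["c3"])],
}

scores = {name: mean_recall(runs, k=2) for name, runs in RESULTS.items()}
best_strategy = max(scores, key=scores.get)
```

With these made-up numbers, splitting by paragraph scores 0.75 and splitting by chapter scores 0.5, so `best_strategy` is `"by_paragraph"`.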

Question 12

A Generative AI Engineer is building an LLM-based application that has an important transcription (speech-to-text) task. Speed is essential for the success of the application.

Which open Generative AI model should be used?

Options:

A.

Llama-2-70b-chat-hf

B.

MPT-30B-Instruct

C.

DBRX

D.

whisper-large-v3 (1.6B)

Question 13

A Generative AI Engineer has been asked to build an LLM-based question-answering application. The application should take into account new documents that are frequently published. The engineer wants to build this application with the least cost and least development effort and have it operate at the lowest cost possible.

Which combination of chaining components and configuration meets these requirements?

Options:

A.

For the application a prompt, a retriever, and an LLM are required. The retriever output is inserted into the prompt which is given to the LLM to generate answers.

B.

The LLM needs to be frequently retrained with the new documents in order to provide the most up-to-date answers.

C.

For the question-answering application, prompt engineering and an LLM are required to generate answers.

D.

For the application a prompt, an agent and a fine-tuned LLM are required. The agent is used by the LLM to retrieve relevant content that is inserted into the prompt which is given to the LLM to generate answers.

Question 14

A Generative AI Engineer is setting up a Databricks Vector Search index that will look up news articles by topic within 10 days of the date specified. An example query might be "Tell me about monster truck news around January 5th 1992". They want to do this with the least amount of effort.

How can they set up their Vector Search index to support this use case?

Options:

A.

Split articles by 10 day blocks and return the block closest to the query.

B.

Include metadata columns for article date and topic to support metadata filtering.

C.

Pass the query directly to the vector search index and return the best articles.

D.

Create separate indexes by topic and add a classifier model to appropriately pick the best index.
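
The metadata filtering mentioned in option B can be illustrated without any vector database. The sketch below uses an in-memory list with hypothetical records and a hypothetical helper, `filter_by_metadata`; it does not reproduce the Databricks Vector Search API, whose filter syntax is not shown in the question.

```python
from datetime import date, timedelta

# Hypothetical article records; in a real index these would be metadata columns
# stored alongside each embedding.
ARTICLES = [
    {"title": "Truck rally recap", "topic": "monster trucks", "published": date(1992, 1, 3)},
    {"title": "Spring truck preview", "topic": "monster trucks", "published": date(1992, 3, 20)},
    {"title": "Election roundup", "topic": "politics", "published": date(1992, 1, 6)},
]


def filter_by_metadata(articles, topic, target, window_days=10):
    """Keep articles on `topic` published within `window_days` of `target`."""
    window = timedelta(days=window_days)
    return [
        a for a in articles
        if a["topic"] == topic and abs(a["published"] - target) <= window
    ]


matches = filter_by_metadata(ARTICLES, "monster trucks", date(1992, 1, 5))
```

Only the first record passes both conditions; similarity ranking would then run over that filtered subset.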

Question 15

A Generative AI Engineer is building an LLM to generate article summaries in the form of a type of poem, such as a haiku, given the article content. However, the initial output from the LLM does not match the desired tone or style.

Which approach will NOT improve the LLM’s response to achieve the desired response?

Options:

A.

Provide the LLM with a prompt that explicitly instructs it to generate text in the desired tone and style

B.

Use a neutralizer to normalize the tone and style of the underlying documents

C.

Include few-shot examples in the prompt to the LLM

D.

Fine-tune the LLM on a dataset of desired tone and style

Question 16

A Generative AI Engineer received the following business requirements for an external chatbot.

The chatbot needs to identify what type of question the user asks and route it to the appropriate model to answer it. For example, one user might ask about upcoming event details. Another user might ask about purchasing tickets for a particular event.

What is an ideal workflow for such a chatbot?

Options:

A.

The chatbot should only look at previous event information

B.

There should be two different chatbots handling different types of user queries.

C.

The chatbot should be implemented as a multi-step LLM workflow. First, identify the type of question asked, then route the question to the appropriate model. If it’s an upcoming event question, send the query to a text-to-SQL model. If it’s about ticket purchasing, the customer should be redirected to a payment platform.

D.

The chatbot should only process payments
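
The classify-then-route workflow described in option C might be sketched as follows. The keyword-based classifier and the two handler strings are hypothetical placeholders; a real system would likely use an LLM or a trained model for the classification step.

```python
def classify_question(question):
    """Toy classifier: a real system would likely use an LLM or trained model."""
    text = question.lower()
    if "ticket" in text or "buy" in text or "purchase" in text:
        return "ticket_purchase"
    return "event_details"


def route(question):
    """Send the question to a handler chosen by its predicted type."""
    handlers = {
        "event_details": lambda q: f"text-to-SQL lookup for: {q}",
        "ticket_purchase": lambda q: "redirect to payment platform",
    }
    return handlers[classify_question(question)](question)


purchase_reply = route("How do I buy tickets for Friday?")
details_reply = route("When does the show start?")
```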

Question 17

After changing the response generating LLM in a RAG pipeline from GPT-4 to a model with a shorter context length that the company self-hosts, the Generative AI Engineer is getting the following error:

[Error message not reproduced in this extract]

What TWO solutions should the Generative AI Engineer implement without changing the response generating model? (Choose two.)

Options:

A.

Use a smaller embedding model to generate

B.

Reduce the maximum output tokens of the new model

C.

Decrease the chunk size of embedded documents

D.

Reduce the number of records retrieved from the vector database

E.

Retrain the response generating model using ALiBi
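
Since the error text itself is missing here, the sketch below assumes it is a context-length overflow, which is consistent with the switch to a shorter-context model. It shows, with purely hypothetical token counts, how chunk size and the number of retrieved records each affect whether a prompt fits.

```python
def prompt_tokens(num_chunks, chunk_tokens, question_tokens, template_tokens):
    """Estimate prompt size when every retrieved chunk is pasted into the prompt."""
    return num_chunks * chunk_tokens + question_tokens + template_tokens


def fits(context_limit, max_output_tokens, **prompt_parts):
    """True when prompt plus reserved output tokens fit in the context window."""
    return prompt_tokens(**prompt_parts) + max_output_tokens <= context_limit


# All numbers below are hypothetical.
LIMIT = 4096
base = dict(question_tokens=50, template_tokens=100)

original = fits(LIMIT, 512, num_chunks=8, chunk_tokens=512, **base)        # 4246 + 512
smaller_chunks = fits(LIMIT, 512, num_chunks=8, chunk_tokens=256, **base)  # 2198 + 512
fewer_records = fits(LIMIT, 512, num_chunks=4, chunk_tokens=512, **base)   # 2198 + 512
```

With these numbers the original configuration overflows the window, while halving either the chunk size or the number of retrieved records brings the total under the limit.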

Question 18

A Generative AI Engineer has created a RAG application which can help employees retrieve answers from an internal knowledge base, such as Confluence pages or Google Drive. The prototype application is now working with some positive feedback from internal company testers. Now the Generative AI Engineer wants to formally evaluate the system’s performance and understand where to focus their efforts to further improve the system.

How should the Generative AI Engineer evaluate the system?

Options:

A.

Use cosine similarity score to comprehensively evaluate the quality of the final generated answers.

B.

Curate a dataset that can test the retrieval and generation components of the system separately. Use MLflow’s built in evaluation metrics to perform the evaluation on the retrieval and generation components.

C.

Benchmark multiple LLMs with the same data and pick the best LLM for the job.

D.

Use an LLM-as-a-judge to evaluate the quality of the final answers generated.

Question 19

A Generative AI Engineer is building a system that will answer questions on currently unfolding news topics. As such, it pulls information from a variety of sources including articles and social media posts. They are concerned about toxic posts on social media causing toxic outputs from their system.

Which guardrail will limit toxic outputs?

Options:

A.

Use only approved social media and news accounts to prevent unexpected toxic data from getting to the LLM.

B.

Implement rate limiting

C.

Reduce the number of context items the system will include in consideration for its response.

D.

Log all LLM system responses and perform a batch toxicity analysis monthly.

Question 20

A Generative AI Engineer at an electronics company just deployed a RAG application for customers to ask questions about products that the company carries. However, they received feedback that the RAG response often returns information about an irrelevant product.

What can the engineer do to improve the relevance of the RAG’s response?

Options:

A.

Assess the quality of the retrieved context

B.

Implement caching for frequently asked questions

C.

Use a different LLM to improve the generated response

D.

Use a different semantic similarity search algorithm

Question 21

Which TWO chain components are required for building a basic LLM-enabled chat application that includes conversational capabilities, knowledge retrieval, and contextual memory?

Options:

A.

(Q)

B.

Vector Stores

C.

Conversation Buffer Memory

D.

External tools

E.

Chat loaders

F.

React Components