
CompTIA DA0-001 CompTIA Data+ Certification Exam Practice Test

Page: 1 / 26
Total 262 questions

CompTIA Data+ Certification Exam Questions and Answers

Question 1

Which of the following tools would be best to use to calculate the interquartile range, median, mean, and standard deviation of a column in a table that has 5,000,000 rows?

Options:

A.

Microsoft Excel

B.

R

C.

Snowflake

D.

SQL
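
For reference, the four statistics named in Question 1 can be computed as in the minimal sketch below. It uses Python's standard library rather than any of the listed tools, assumes the column fits in memory, and runs on randomly generated stand-in data (the `column` list is hypothetical).

```python
import random
import statistics

# Hypothetical stand-in for one numeric column; a real table would be far larger.
rng = random.Random(42)
column = [rng.gauss(100, 15) for _ in range(10_000)]

# quantiles(..., n=4) returns the three quartile cut points Q1, Q2, Q3.
q1, q2, q3 = statistics.quantiles(column, n=4)

summary = {
    "mean": statistics.fmean(column),
    "median": statistics.median(column),
    "stdev": statistics.stdev(column),  # sample standard deviation
    "iqr": q3 - q1,                     # interquartile range
}

for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```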

Question 2

Which of the following are reasons to conduct data cleansing? (Select two).

Options:

A.

To perform web scraping

B.

To track KPIs

C.

To improve accuracy

D.

To review data sets

E.

To increase the sample size

F.

To calculate trends

Question 3

A data analyst needs to present the results of an online marketing campaign to the marketing manager. The manager wants to see the most important KPIs and measure the return on marketing investment. Which of the following should the data analyst use to BEST communicate this information to the manager?

Options:

A.

A real-time monitor that allows the manager to view performance the day the campaign was launched

B.

A self-service dashboard that allows the manager to look at the company's annual budget performance

C.

A spreadsheet of the raw data from all marketing campaigns and channels

D.

A summary with statistics, conclusions, and recommendations from the data analyst

Question 4

An analyst has received the requirements for an internal user dashboard. The analyst confirms the data sources and then creates a wireframe. Which of the following is the NEXT step the analyst should take in the dashboard creation process?

Options:

A.

Optimize the dashboard.

B.

Create subscriptions.

C.

Get stakeholder approval.

D.

Deploy to production.

Question 5

Given the information in the following tables:

Question # 5

Which of the following describes merging these tables to create a master file that includes all transactions for both online and in-store sales?

Options:

A.

Data audit

B.

Data completeness

C.

Data validation

D.

Data consolidation

Question 6

Given the table below:

Question # 6

Which of the following boxes indicates that a Type II error has occurred?

Options:

A.

1

B.

2

C.

3

D.

4

Question 7

Which of the following best describes how discrete data differs from continuous data?

Options:

A.

Discrete data cannot create a sloped line.

B.

Discrete data can only be a finite number of values.

C.

Discrete data can have decimal points.

D.

Discrete data applies only to numbers.

Question 8

Which one of the following is not considered an aggregate function?

Options:

A.

SUM

B.

MIN

C.

SELECT

D.

MAX
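
To illustrate what an aggregate function does, here is a small sketch using Python's built-in `sqlite3` module. The `sales` table and its values are hypothetical; the point is only that `SUM`, `MIN`, and `MAX` each collapse many rows into a single value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (amount INTEGER)")  # hypothetical table
conn.executemany("INSERT INTO sales VALUES (?)", [(10,), (25,), (40,)])

# Each aggregate function reduces the three rows to one value.
total, smallest, largest = conn.execute(
    "SELECT SUM(amount), MIN(amount), MAX(amount) FROM sales"
).fetchone()

print(total, smallest, largest)
conn.close()
```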

Question 9

The ACME Corporation hired an analyst to detect data quality issues in their Excel documents. Which of the following are the most common issues? (Select TWO)

Options:

A.

Apostrophe.

B.

Commas.

C.

Symbols.

D.

Duplicates.

E.

Misspellings.

Question 10

Which of the following statistical methods requires two or more categorical variables?

Options:

A.

Simple linear regression

B.

Chi-squared test

C.

Z-test

D.

Two-sample t-test

Question 11

An analyst needs to provide a chart to identify the composition between the categories of the survey response data set:

Question # 11

Which of the following charts would be BEST to use?

Options:

A.

Histogram

B.

Pie

C.

Line

D.

Scatter plot

E.

Waterfall

Question 12

A development company is constructing a new unit in its apartment complex. The complex has the following floor plans:

Question # 12

Using the average cost per square foot of the original floor plans, which of the following should be the price of the Rose unit?

Options:

A.

$640,900

B.

$690,000

C.

$705,200

D.

$702,500

Question 13

You have two database tables that you would like to join together using a foreign key relationship.

What term best describes this action?

Options:

A.

Blending.

B.

Appending.

C.

Mixing.

D.

Merging.

Question 14

Given the table below:

Question # 14

Which of the following variables can be considered inconsistent, and how many distinct values should the variable have?

Options:

A.

Name, one

B.

Gender, two

C.

Level, three

D.

Code, four

E.

Region, five

Question 15

A data analyst received a large amount of third-party data that needs to be joined with in-house data files. After the data is joined, the analyst notices three columns all contain dates. Which of the following should the analyst do to maintain data consistency?

Options:

A.

Append all date columns and parse the strings.

B.

Impute all three date columns and then merge.

C.

Merge all date columns and unify the format.

D.

Separate the columns into a table and merge.

Question 16

The number of phone calls that the call center receives in a day is an example of:

Options:

A.

continuous data.

B.

categorical data.

C.

ordinal data.

D.

discrete data.

Question 17

The current date is July 14, 2020. A data analyst has been asked to create a report that shows the company’s year-over-year Q2 2020 sales. Which of the following reports should the analyst compare?

Options:

A.

Q2 2020 and Q4 2019

B.

YTD 2020 and YTD 2019

C.

Q2 2020 and Q2 2019

D.

Q2 2020 and Q2 2021

Question 18

A military commander would like to see the health scorecards of the troops daily and filter them based on gender and rank. Considering this data is PHI, which of the following would be the best way for the commander to view the information?

Options:

A.

An emailed report

B.

A password-protected dashboard

C.

A daily printout of a report

D.

A cloud-hosted spreadsheet

Question 19

A user receives a large custom report to track company sales across various date ranges. The user then completes a series of manual calculations for each date range. Which of the following should an analyst suggest so the user has a dynamic, seamless experience?

Options:

A.

Create multiple reports, one for each needed date range.

B.

Build calculations into the report so they are done automatically.

C.

Add macros to the report to speed up the filtering and calculations process.

D.

Create a dashboard with a date range picker and calculations built in.

Question 20

Jenny wants to study the academic performance of undergraduate sophomores and wants to determine the average grade point average at different points during an academic year.

What best describes the data set she needs?

Options:

A.

Sample.

B.

Observation.

C.

Variable.

D.

Population.

Question 21

Under which of the following circumstances should the null hypothesis be accepted when α = 0.05?

Options:

A.

When p is 0.00003

B.

When p is 0.001

C.

When p is 0.04

D.

When p is 0.06
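
The comparison behind Question 21 can be sketched as a one-line decision rule. This is a minimal illustration, assuming the common convention of rejecting the null hypothesis when p ≤ α; statisticians usually say "fail to reject" rather than "accept". The function name `decision` is hypothetical.

```python
ALPHA = 0.05  # significance level given in the question


def decision(p_value, alpha=ALPHA):
    """Return the conventional hypothesis-test decision for a p-value."""
    return "reject H0" if p_value <= alpha else "fail to reject H0"


# The four p-values listed in the options.
for p in (0.00003, 0.001, 0.04, 0.06):
    print(p, "->", decision(p))
```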

Question 22

Question # 22

Which of the following summary statements upholds integrity in data reporting?

Options:

A.

Sales are approximately equal for Product A and Product B across all strategies.

B.

Strategy 4 provides the best sales in comparison to other strategies.

C.

While Strategy 2 does not result in the highest sales of Product D, over all products it appears to be the most effective.

D.

Product D should be promoted more than the other products in all strategies.

Question 23

A data analyst has been asked to derive a new variable labeled “Promotion_flag” based on the total quantity sold by each salesperson. Given the table below:

Question # 23

Which of the following functions would the analyst consider appropriate to flag “Yes” for every salesperson who has a number above 1,000,000 in the Quantity_sold column?

Options:

A.

Date

B.

Mathematical

C.

Logical

D.

Aggregate
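
Deriving a flag like the one described in Question 23 might look like the sketch below. The rows are hypothetical because the original table is not reproduced here; only the column names `Quantity_sold` and `Promotion_flag` and the 1,000,000 threshold come from the question.

```python
# Hypothetical rows; the original table is not available.
rows = [
    {"salesperson": "A", "Quantity_sold": 1_250_000},
    {"salesperson": "B", "Quantity_sold": 980_000},
    {"salesperson": "C", "Quantity_sold": 1_000_000},
]

THRESHOLD = 1_000_000

# A conditional test decides the flag; "above" means strictly greater than.
for row in rows:
    row["Promotion_flag"] = "Yes" if row["Quantity_sold"] > THRESHOLD else "No"

flags = [row["Promotion_flag"] for row in rows]
print(flags)
```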

Question 24

An analyst modified a data set that had a number of issues. Given the original and modified versions:

Question # 24

Which of the following data manipulation techniques did the analyst use?

Options:

A.

Imputation

B.

Recoding

C.

Parsing

D.

Deriving

Question 25

A report is scheduled to run and be distributed at the end of business each day. On Mondays, one of the recipients opens the previous week's reports and combines them to calculate the weekly totals and projections for the coming week. This is a tedious process, and the recipient asks an analyst for help. Which of the following should the analyst recommend?

Options:

A.

Add calculation fields to the daily report so the totals are built in.

B.

Create a new report with weekly totals set to run at the end of business on Friday.

C.

Provide a daily summary to the report with totals to save the user the effort of manual calculations.

D.

Reduce the frequency of the report to once a week and change the date range.

Question 26

A collections manager has a team calling customers who are past due on their accounts in an attempt to collect payments. The manager receives the call list in the form of a printed report that is generated by the accounting department at the beginning of each week. Consequently, the collections team calls some customers who have made payments in the time since the report was last printed. Which of the following reporting enhancements could the accounting department implement to best reduce the number of calls on current accounts?

Options:

A.

Modify the date range on the report

B.

Include a time stamp on the report.

C.

Increase the frequency of report generation.

D.

Add a report run date to the report.

Question 27

When analyzing the values of two variables, you decide to convert both variables so they are on a scale of 0 to 1.

What term describes this action?

Options:

A.

Filtering.

B.

Normalization.

C.

Transposition.

D.

Aggregation.
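
Rescaling two variables to a common 0 to 1 range, as described in Question 27, is often done with min-max scaling. The sketch below is one way to do it; the helper `min_max_scale` and both sample lists are hypothetical.

```python
def min_max_scale(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: no spread to scale, so map everything to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


heights_cm = [150, 165, 180]       # hypothetical variable 1
incomes = [30_000, 45_000, 90_000]  # hypothetical variable 2

scaled_heights = min_max_scale(heights_cm)
scaled_incomes = min_max_scale(incomes)
print(scaled_heights, scaled_incomes)
```

After scaling, both variables share the same range even though their original units differ.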

Question 28

A data analyst has received a data set that contains actual and projected sales for the fourth quarter of 2019. Which of the following statistical methods should the analyst use to find the measure of dispersion?

Options:

A.

Mean

B.

Variance

C.

Correlation

D.

Confidence interval

Question 29

Different people manually type a series of handwritten surveys into an online database. Which of the following issues will MOST likely arise with this data? (Choose two.)

Options:

A.

Data accuracy

B.

Data constraints

C.

Data attribute limitations

D.

Data bias

E.

Data consistency

F.

Data manipulation

Question 30

Which of the following is an example of a flat file?

Options:

A.

CSV file

B.

PDF file

C.

JSON file

D.

JPEG file

Question 31

Given the customer table below:

Question # 31

Which of the following chart types is the most appropriate to represent the average spending of active customers vs. inactive customers?

Options:

A.

Pie chart

B.

Heat graph

C.

Scatter plot

D.

Line chart

Question 32

A data analyst needs to create a dashboard to help identify trends in the data sets. Which of the following is an appropriate consideration for dashboard development?

Options:

A.

Data sources and attributes

B.

Frequently asked questions

C.

A report from the data source

D.

A comparison of data sets

Question 33

A data analyst has a set of data that shows the number of gallons of oil produced each day. The company would like to know the standard deviation for the data set. The variance for the data is 36 gallons. Which of the following is the standard deviation for gallons produced?

Options:

A.

1.16

B.

6

C.

36

D.

72
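
The relationship used in Question 33 is that standard deviation is the square root of variance. A minimal check, using the variance value stated in the question (strictly, variance is in squared units, so the result is in gallons):

```python
import math

variance = 36                    # value given in the question
std_dev = math.sqrt(variance)    # standard deviation = sqrt(variance)
print(std_dev)
```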

Question 34

Which of the following variable name formats would be problematic if used in the majority of data software programs?

Options:

A.

First_Name_

B.

FirstName

C.

First_Name

D.

First Name

Question 35

Which of the following contains alphanumeric values?

Options:

A.

10.1E²

B.

13.6

C.

1347

D.

A3J7

Question 36

Each month an analyst needs to execute a data pull for the two prior months. Which of the following is the most efficient function for the analyst to use?

Options:

A.

Logical

B.

Date

C.

Aggregate

D.

System
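
A recurring pull for the two prior months, as in Question 36, comes down to computing a rolling date window. The sketch below is one possible approach in Python; the helper `prior_two_months` is hypothetical, and the example date is used only for illustration.

```python
from datetime import date


def prior_two_months(today):
    """Return (start, end): start is the first day of the month two months
    back, end is the first day of the current month (exclusive bound)."""
    end = today.replace(day=1)
    # Count months since year 0 so that year boundaries are handled.
    months = end.year * 12 + (end.month - 1) - 2
    start = date(months // 12, months % 12 + 1, 1)
    return start, end


start, end = prior_two_months(date(2020, 7, 14))
print(start, end)
```

A query could then filter on `start <= order_date < end` without the window being edited by hand each month.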

Question 37

You should always choose the analytics tool that is most appropriate for any given situation, even if that means acquiring a new tool.

Options:

A.

True.

B.

False.

Question 38

Which of the following is the BEST reason to use database views instead of tables?

Options:

A.

Views reduce the need for repetitive, complex data joins.

B.

Views allow for the storage of temporary data, whereas tables do not.

C.

Views allow for the joining of multiple data sources, whereas tables do not.

D.

Views can be used to restrict sensitive information.

Question 39

Which of the following database schemas features normalized dimension tables?

Options:

A.

Flat

B.

Snowflake

C.

Hierarchical

D.

Star

Question 40

An analyst is working on a project for a director. During this process, the analyst pulled the data, created summarized tables and graphs with descriptions, created a report summary, and inserted all items into a report. After writing the report, which of the following would be the most appropriate next step?

Options:

A.

Complete an audit on the data pulled for the report.

B.

Complete a check for quality in the report.

C.

Complete a review of the data and a check for consistency

D.

Complete a trend analysis to be included in the report.

Question 41

Consider this dataset showing the retirement age of 11 people, in whole years:

54, 54, 54, 55, 56, 57, 57, 58, 58, 60, 60

This table shows a simple frequency distribution of the retirement age data.

Question # 41

Options:

A.

56

B.

55

C.

57

D.

54
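
The stem of Question 41 does not state which statistic is being asked for, so the sketch below only rebuilds the frequency distribution from the listed ages and computes two common summary values. It is an illustration using Python's standard library, not a statement of the intended answer.

```python
from collections import Counter
import statistics

# Retirement ages exactly as listed in the question.
ages = [54, 54, 54, 55, 56, 57, 57, 58, 58, 60, 60]

# Simple frequency distribution: age -> number of people.
frequency = Counter(ages)
for age in sorted(frequency):
    print(age, frequency[age])

median_age = statistics.median(ages)  # middle value of the 11 sorted ages
mode_age = statistics.mode(ages)      # most frequent age
print("median:", median_age, "mode:", mode_age)
```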

Question 42

After completing web scraping, which of the following file formats needs to be parsed?

Options:

A.

.html

B.

.txt

C.

.csv

D.

.tsv

Question 43

Given the following graph:

Question # 43

Which of the following summary statements upholds integrity in data reporting?

Options:

A.

Sales are approximately equal for Product A and Product B across all strategies.

B.

Strategy 4 provides the best sales in comparison to other strategies.

C.

While Strategy 2 does not result in the highest sales of Product D, over all products it appears to be the most effective.

D.

Product D should be promoted more than the other products in all strategies.

Question 44

A data analyst has been asked to merge the tables below, first performing an INNER JOIN and then a LEFT JOIN:

Question # 44

Customer Table -

In-store Transactions –

Question # 44

Which of the following describes the number of rows of data that can be expected after performing both joins in the order stated, considering the customer table as the main table?

Options:

A.

INNER: 6 rows; LEFT: 9 rows

B.

INNER: 9 rows; LEFT: 6 rows

C.

INNER: 9 rows; LEFT: 15 rows

D.

INNER: 15 rows; LEFT: 9 rows
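
The tables for Question 44 are not reproduced here, so the sketch below uses small hypothetical tables to show why the two joins can return different row counts. An INNER JOIN keeps only customers with matching transactions, while a LEFT JOIN keeps every customer and fills the gaps with NULLs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: customer 1 has two transactions, customer 3 has none.
conn.executescript("""
    CREATE TABLE customer (customer_id INTEGER, name TEXT);
    CREATE TABLE in_store (customer_id INTEGER, amount INTEGER);
    INSERT INTO customer VALUES (1, 'Ana'), (2, 'Ben'), (3, 'Cy');
    INSERT INTO in_store VALUES (1, 20), (1, 35), (2, 15);
""")

inner_rows = conn.execute(
    "SELECT * FROM customer c "
    "INNER JOIN in_store t ON c.customer_id = t.customer_id"
).fetchall()

left_rows = conn.execute(
    "SELECT * FROM customer c "
    "LEFT JOIN in_store t ON c.customer_id = t.customer_id"
).fetchall()

print(len(inner_rows), len(left_rows))
conn.close()
```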

Question 45

Which of the following data types would a telephone number formatted as XXX-XXX-XXXX be considered?

Options:

A.

Numeric

B.

Date

C.

Float

D.

Text

Question 46

What analytics suite is offered by Microsoft and directly integrates with SQL Server Databases?

Options:

A.

Qlik.

B.

Power BI.

C.

Domo.

D.

Dataroma.

Question 47

A data analyst is working with a team to create a dashboard for a client who requires on-demand access. Which of the following is the best delivery method to support the client’s requirement?

Options:

A.

Email

B.

Scheduled

C.

Subscription

D.

Static

Question 48

A research analyst wants to determine whether the data being analyzed is connected to other datapoints. Which of the following is the BEST type of analysis to conduct?

Options:

A.

Trend analysis

B.

Performance analysis

C.

Link analysis

D.

Exploratory analysis

Question 49

Which of the following is the correct data type for text?

Options:

A.

Boolean

B.

String

C.

Integer

D.

Float

Question 50

Which of the following is a domain-specific language used in programming that is designed for managing data that is held in a relational data stream management system?

Options:

A.

SAS

B.

SQL

C.

Python

D.

R

Question 51

A data analyst for a media company needs to determine the most popular movie genre. Given the table below:

Question # 51

Which of the following must be done to the Genre column before this task can be completed?

Options:

A.

Append

B.

Merge

C.

Concatenate

D.

Delimit
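
The table for Question 51 is not reproduced here. Assuming, purely for illustration, that the Genre column holds several comma-separated genres per cell, splitting each cell on its delimiter before counting might look like this (the `genre_column` values are hypothetical):

```python
from collections import Counter

# Hypothetical Genre values; the original table is not available.
genre_column = ["Action, Comedy", "Comedy", "Drama, Comedy", "Action"]

# Split each cell on the comma delimiter, trim whitespace, then count.
counts = Counter(
    genre.strip()
    for cell in genre_column
    for genre in cell.split(",")
)

most_common_genre, occurrences = counts.most_common(1)[0]
print(most_common_genre, occurrences)
```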

Question 52

Which of the following is a common data analytics tool that is also used as an interpreted, high-level, general-purpose programming language?

Options:

A.

SAS

B.

Microsoft Power BI

C.

IBM SPSS

D.

Python

Question 53

A research analyst collects ten data points from 1,000 specimens. The analyst will not need any additional data to complete the analysis and will not need to retrieve information by specifier. Which of the following is the best data structure for the analyst to use?

Options:

A.

NoSQL

B.

Flat file

C.

JSON

D.

Relational database

Question 54

Which of the following best describes the law of large numbers?

Options:

A.

As a sample size decreases, its standard deviation gets closer to the average of the whole population.

B.

As a sample size grows, its mean gets closer to the average of the whole population

C.

As a sample size decreases, its mean gets closer to the average of the whole population.

D.

When a sample size doubles, the sample is indicative of the whole population.
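
The law of large numbers can be seen in a short simulation. The sketch below rolls a fair six-sided die (expected value 3.5) and records how far the sample mean is from that value at several sample sizes. The seed and sample sizes are arbitrary choices for illustration.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable
EXPECTED = 3.5          # theoretical mean of a fair six-sided die


def sample_mean(n):
    """Mean of n simulated die rolls."""
    return sum(rng.randint(1, 6) for _ in range(n)) / n


# Absolute gap between sample mean and expected value per sample size.
errors = {n: abs(sample_mean(n) - EXPECTED) for n in (10, 1_000, 100_000)}

for n, err in errors.items():
    print(f"n={n:>7}: |sample mean - 3.5| = {err:.4f}")
```

Individual runs fluctuate, but the gap tends to shrink as the sample size grows.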

Question 55

An analyst is working with the income data of suburban families in the United States. The data set has a lot of outliers, and the analyst needs to provide a measure that represents the typical income. Which of the following would BEST fulfill the analyst’s goal?

Options:

A.

Median

B.

Mean

C.

Mode

D.

Standard deviation
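
How outliers affect different measures, as in Question 55, can be shown with a small hypothetical income list in which one value is far larger than the rest:

```python
import statistics

# Hypothetical household incomes with a single extreme outlier.
incomes = [48_000, 52_000, 55_000, 58_000, 61_000, 2_500_000]

mean_income = statistics.fmean(incomes)     # pulled upward by the outlier
median_income = statistics.median(incomes)  # middle of the sorted values

print(f"mean:   {mean_income:,.0f}")
print(f"median: {median_income:,.0f}")
```

Here the mean lands far above every income except the outlier, while the median stays among the typical values.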

Question 56

A data analyst is creating a report that will provide information about various regions, products, and time periods. Which of the following formats would be the most efficient way to deliver this report?

Options:

A.

A workbook with multiple tabs for each region

B.

A daily email with snapshots of regional summaries

C.

A static report with a different page for every filtered view

D.

A dashboard with filters at the top that the user can toggle

Question 57

An analyst needs to conduct a quick analysis. Which of the following is the FIRST step the analyst should perform with the data?

Options:

A.

Conduct an exploratory analysis and use descriptive statistics.

B.

Conduct a trend analysis and use a scatter chart.

C.

Conduct a link analysis and illustrate the connection points.

D.

Conduct an initial analysis and use a Pareto chart.

Question 58

Which of the following is the first step an analyst should perform upon receiving a business request for analysis?

Options:

A.

Determine the data needs and sources for analysis.

B.

Initiate the analysis for exploratory data analysis.

C.

Review the business questions to understand the scope.

D.

Finalize the methodology to solve the problem.

Question 59

Which of the following is a difference between a primary key and a unique key?

Options:

A.

A unique key cannot take null values, whereas a primary key can take null values.

B.

There can be only one primary key in a data set, whereas there can be multiple unique keys.

C.

A primary key can take a value more than once, whereas a unique key cannot take a value more than once.

D.

A primary key cannot be a date variable, whereas a unique key can be.

Question 60

Which of the following would be considered non-personally identifiable information?

Options:

A.

Cell phone device name

B.

Customer’s name

C.

Government ID number

D.

Telephone number

Question 61

A table in a hospital database has a column for patient height in inches and a column for patient height in centimeters. This is an example of:

Options:

A.

dependent data.

B.

duplicate data.

C.

invalid data

D.

redundant data

Question 62

Given the following data tables:

Question # 62

Which of the following MDM processes needs to take place FIRST?

Options:

A.

Creation of a data dictionary

B.

Compliance with regulations

C.

Standardization of data field names

D.

Consolidation of multiple data fields

Question 63

A data analyst is attempting to understand how ice cream consumption is affected by different attributes, such as cost, temperature, and income level. Which of the following regression analyses should the data analyst perform to understand this relationship?

Options:

A.

Logistic

B.

Ordinary least squares

C.

Cox

D.

Polynomial

Question 64

Amanda needs to create a dashboard that will draw information from many other data sources and present it to business leaders.

Which one of the following tools is least likely to meet her needs?

Options:

A.

QuickSight.

B.

Tableau.

C.

Power BI.

D.

SPSS Modeler.

Question 65

Kelly wants to get feedback on the final draft of a strategic report that has taken her six months to develop.

What can she do to prevent confusion as she seeks feedback before publishing the report?

Choose the best answer.

Options:

A.

Distribute the report to the appropriate stakeholders via email.

B.

Use a watermark to identify the report as a draft.

C.

Show the report to her immediate supervisor.

D.

Publish the report on an internally facing website.

Question 66

An e-commerce company recently tested a new website layout. The website was tested by a test group of customers, and an old website was presented to a control group. The table below shows the percentage of users in each group who made purchases on the websites:

Question # 66

Which of the following conclusions is accurate at a 95% confidence interval?

Options:

A.

In Germany, the increase in conversion from the new layout was not significant.

B.

In France, the increase in conversion from the new layout was not significant.

C.

In general, users who visit the new website are more likely to make a purchase.

D.

The new layout has the lowest conversion rates in the United Kingdom.

Question 67

Which of the following descriptive statistical methods are measures of central tendency? (Choose two.)

Options:

A.

Mean

B.

Minimum

C.

Mode

D.

Variance

E.

Correlation

F.

Maximum

Question 68

Which of the following technologies would be best suited for creating a multiple linear regression model?

Options:

A.

Microsoft Power BI

B.

R

C.

SQL

D.

Tableau

Question 69

An analyst has generated a report that includes the number of months in the first two quarters of 2019 when sales exceeded $50,000:

Question # 69

Which of the following functions did the analyst use to generate the data in the Sales_indicator column?

Options:

A.

Aggregate

B.

Logical

C.

Date

D.

Sort

Question 70

Which of the following are reasons to create and maintain a data dictionary? (Choose two.)

Options:

A.

To improve data acquisition

B.

To remember specifics about data fields

C.

To specify user groups for databases

D.

To provide continuity through personnel turnover

E.

To confine breaches of PHI data

F.

To reduce processing power requirements

Question 71

An employer needs to maintain adequate office staffing during the winter and wants to track storm data. Which of the following data collection methods should the employer use?

Options:

A.

Web scraping

B.

Public databases

C.

Observations

D.

Weather surveys

Question 72

Given the following:

Question # 72

Which of the following is the most important thing for an analyst to do when transforming the table for a trend analysis?

Options:

A.

Fill in the missing cost where it is null.

B.

Separate the table into two tables and create a primary key

C.

Replace the extended cost field with a calculated field.

D.

Correct the dates so they have the same format.

Question 73

A data analyst for a media company needs to determine the most popular movie genre. Given the table below:

Question # 73

Which of the following must be done to the Genre column before this task can be completed?

Options:

A.

Append

B.

Merge

C.

Concatenate

D.

Delimit

Question 74

Which of the following would be the best way to identify multicollinear attributes in a data set?

Options:

A.

Correlation coefficient

B.

Chi-squared test

C.

Two-sample f-test

D.

Two-way ANOVA

Question 75

Which of the following BEST describes standard deviation?

Options:

A.

A measure that is used to establish a relationship between two variables

B.

A measure of how data is distributed

C.

A measure of the amount of dispersion of a set of values

D.

A measure that is used to find the significant difference between variables

Question 76

Which one of the following values will appear first if they are sorted in descending order?

Options:

A.

Aaron.

B.

Molly.

C.

Xavier.

D.

Adam.
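
Descending order for text values, as in Question 76, is reverse alphabetical order. A quick sketch using the four names from the options, assuming a plain case-sensitive string comparison:

```python
names = ["Aaron", "Molly", "Xavier", "Adam"]

# reverse=True sorts from the end of the alphabet toward the beginning.
descending = sorted(names, reverse=True)
print(descending)
```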

Question 77

Taylor wants to investigate how manufacturing, marketing, and sales expenditures impact overall profitability for her company.

Which of the following systems is the most appropriate?

Options:

A.

OLTP.

B.

OLAP.

C.

Data warehouse.

D.

Data mart.

Question 78

Five dogs have the following heights in millimeters:

300, 430, 170, 470, 600

Which of the following is the standard deviation for the five dogs?

Options:

A.

147 mm

B.

154 mm

C.

394 mm

D.

21,704 mm
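
The calculation in Question 78 can be checked with Python's `statistics` module. The question does not say whether the five dogs are a whole population or a sample, so this sketch computes both forms: `pstdev` divides by n, `stdev` divides by n - 1.

```python
import statistics

heights_mm = [300, 430, 170, 470, 600]  # heights from the question

mean_height = statistics.fmean(heights_mm)
population_sd = statistics.pstdev(heights_mm)  # divides by n
sample_sd = statistics.stdev(heights_mm)       # divides by n - 1

print(f"mean: {mean_height:.0f} mm")
print(f"population SD: {population_sd:.1f} mm")
print(f"sample SD: {sample_sd:.1f} mm")
```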
