Data-Analysis-Portfolio

Data analysis projects


Project maintained by rasulov94

Table of Contents

1. Project and task description

2. Data Preprocessing

3. Study and check the data

4. Study the event funnel

5. Study the results of the experiment

Project Description

A startup sells food products. The goal is to investigate user behavior in the company’s app.

Each log entry is a user action or an event:
EventName — event name
DeviceIDHash — unique user identifier
EventTimestamp — event time
ExpId — experiment number: 246 and 247 are the control groups, 248 is the test group

Then look at the results of an A/A/B test. The designers would like to change the fonts for the entire app, but the managers are afraid the users might find the new design intimidating. They decide to make a decision based on the results of an A/A/B test.

The users are split into three groups: two control groups get the old fonts and one test group gets the new ones. Find out which set of fonts produces better results.

Creating two A groups has certain advantages. We can make it a principle that we will only be confident in the accuracy of our testing when the two control groups are similar. If there are significant differences between the A groups, this can help us uncover factors that may be distorting the results. Comparing control groups also tells us how much time and data we’ll need when running further tests.

Task

Data Preprocessing

We have preprocessed the data and made the following changes:

  1. Columns were renamed for the sake of readability.
  2. No missing values were found.
  3. 413 duplicated values were detected and removed.
  4. The event_datetime column was converted to datetime type; dates were extracted into a separate event_date column and times into an event_time column.
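The four steps above can be sketched with pandas as follows. This is a minimal illustration, not the notebook's actual code: the sample rows are made up, the new column names other than event_datetime, event_date and event_time are assumptions, and EventTimestamp is assumed to hold Unix time in seconds.

```python
import pandas as pd

# Tiny made-up sample in the raw log layout (values are illustrative only).
raw = pd.DataFrame({
    "EventName": ["main_screen", "main_screen", "cart"],
    "DeviceIDHash": [1001, 1001, 1002],
    "EventTimestamp": [1564617600, 1564617600, 1564704000],  # assumed Unix seconds
    "ExpId": [246, 246, 248],
})


def preprocess(raw_df):
    """Rename columns, drop exact duplicate rows, and split the timestamp."""
    # event_name, user_id and exp_id are hypothetical names chosen for readability.
    df = raw_df.rename(columns={
        "EventName": "event_name",
        "DeviceIDHash": "user_id",
        "EventTimestamp": "event_datetime",
        "ExpId": "exp_id",
    })
    # Remove rows that are identical in every column.
    df = df.drop_duplicates().reset_index(drop=True)
    # Convert to datetime, then extract the date and time parts.
    df["event_datetime"] = pd.to_datetime(df["event_datetime"], unit="s")
    df["event_date"] = df["event_datetime"].dt.date
    df["event_time"] = df["event_datetime"].dt.time
    return df


events = preprocess(raw)
missing_values = int(events.isna().sum().sum())  # count of missing cells
```

On the sample above, the two identical rows collapse into one, so events holds two rows and missing_values is zero.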

Study and check the data

  1. Average number of events per user.

  2. What period of time does the data cover?

As the histogram shows, the events occurred from 2019-07-31 onward, that is, in the second week of the testing period, which is the period our data actually represents. We therefore filter the data and keep only the relevant records, those from 2019-07-31 onward.
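The checks in this section could look roughly like the sketch below. The data is made up, the column names user_id and event_datetime are assumptions, and the cutoff is treated as inclusive, which is one reading of the text rather than a confirmed detail.

```python
import pandas as pd

# Made-up events; only the shape of the data matters here.
events = pd.DataFrame({
    "user_id": [1, 1, 2, 2, 2, 3],
    "event_datetime": pd.to_datetime([
        "2019-07-26 10:00", "2019-08-01 09:00", "2019-07-31 12:00",
        "2019-08-02 08:30", "2019-08-03 14:00", "2019-08-05 18:45",
    ]),
})

# Average number of events per user: count rows per user, then take the mean.
avg_events_per_user = events.groupby("user_id").size().mean()

# Period covered by the data.
period_start = events["event_datetime"].min()
period_end = events["event_datetime"].max()

# Events per day: the numbers a date histogram would display.
daily_counts = events.groupby(events["event_datetime"].dt.date).size()

# Keep only events on or after the cutoff date (inclusive cutoff is an assumption).
cutoff = pd.Timestamp("2019-07-31")
filtered = events[events["event_datetime"] >= cutoff]
```

Comparing len(events) with len(filtered), and the number of unique users before and after, shows how much data the filter discards.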

Study the event funnel

48% of users make the entire journey from their first event to payment.
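A share like this can be derived by counting unique users at each funnel step. The sketch below uses made-up data and hypothetical event names (main_screen, cart, payment); it counts users who triggered each event at least once and does not check that the events happened in order, which a stricter funnel would require.

```python
import pandas as pd

# Made-up events; event names and funnel order are assumptions for illustration.
events = pd.DataFrame({
    "user_id": [1, 1, 1, 2, 2, 3, 4, 4, 4],
    "event_name": [
        "main_screen", "cart", "payment",
        "main_screen", "cart",
        "main_screen",
        "main_screen", "cart", "payment",
    ],
})
funnel_steps = ["main_screen", "cart", "payment"]

# Unique users who performed each event, arranged in funnel order.
users_per_step = (
    events.groupby("event_name")["user_id"]
    .nunique()
    .reindex(funnel_steps, fill_value=0)
)

# Share of users moving from each step to the next (first step has no predecessor).
step_conversion = users_per_step / users_per_step.shift(1)

# Share of users who go from the first step all the way to payment.
overall_conversion = users_per_step.iloc[-1] / users_per_step.iloc[0]
```

The step with the lowest value in step_conversion is where the most users drop off.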

Study the results of the experiment

Note: the critical significance level (alpha) is 0.05.
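One common way to compare the share of users reaching an event in two groups at this significance level is a two-sided z-test for two proportions. The source does not state which test the notebook uses, so the sketch below is an assumption, and the counts passed to it are made up.

```python
from math import sqrt
from statistics import NormalDist

ALPHA = 0.05  # critical significance level from the note above


def proportions_ztest(successes_a, total_a, successes_b, total_b):
    """Two-sided z-test for the equality of two proportions (pooled variance)."""
    p_a = successes_a / total_a
    p_b = successes_b / total_b
    # Pooled proportion under the null hypothesis that both groups are equal.
    p_pool = (successes_a + successes_b) / (total_a + total_b)
    std_err = sqrt(p_pool * (1 - p_pool) * (1 / total_a + 1 / total_b))
    z = (p_a - p_b) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical counts: users who reached an event out of all users in each group.
z_stat, p_value = proportions_ztest(480, 1000, 495, 1000)
reject_null = p_value < ALPHA
```

The same function can be applied to the two control groups (246 vs 247) as the A/A check, and then to each control group against the test group (248) for every funnel event.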

In the final step, we studied the results of the experiment and tested several hypotheses. The outcomes are as follows:

Reference: Jupyter notebook