Data analysis projects
A startup sells food products. Investigate user behavior for the company's app.
Each log entry is a user action or an event, with the following fields:
EventName — event name
DeviceIDHash — unique user identifier
EventTimestamp — event time
ExpId — experiment number: 246 and 247 are the control groups, 248 is the test group
Then look at the results of an A/A/B test. The designers would like to change the fonts for the entire app, but the managers are afraid the users might find the new design intimidating. They decide to settle the question based on the results of an A/A/B test.
The users are split into three groups: two control groups get the old fonts and one test group gets the new ones. Find out which set of fonts produces better results.
Creating two A groups has certain advantages. We can make it a principle that we will only be confident in the accuracy of our testing when the two control groups are similar. If there are significant differences between the A groups, this can help us uncover factors that may be distorting the results. Comparing control groups also tells us how much time and data we’ll need when running further tests.
We have preprocessed the data and made the following changes:
The event_datetime column was converted to datetime type; the date was extracted into a separate event_date column, and the time was extracted and saved to an event_time column.
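
The conversion described above could look roughly like the following pandas sketch. It assumes the raw event_datetime values are Unix timestamps in seconds, which the text does not state; the helper name add_date_parts and the two-row sample are hypothetical and only serve as illustration.

```python
import pandas as pd


def add_date_parts(logs: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with event_datetime parsed and event_date / event_time added."""
    out = logs.copy()
    # Assumption: event_datetime holds Unix timestamps in seconds.
    out["event_datetime"] = pd.to_datetime(out["event_datetime"], unit="s")
    # Split the parsed timestamp into a calendar date and a time of day.
    out["event_date"] = out["event_datetime"].dt.date
    out["event_time"] = out["event_datetime"].dt.time
    return out


# Hypothetical sample, not taken from the real log.
sample = pd.DataFrame({"event_datetime": [1564531200, 1564617600 + 3661]})
prepared = add_date_parts(sample)
print(prepared)
```

If the raw column already contains date strings, dropping the unit argument should be enough.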

As the histogram of events by date shows, the events occurred from 2019-07-31 onward, that is, in the second week of the testing period, so this is the period our data actually represents. We can therefore filter the data and keep only the relevant records, those from 2019-07-31 onward.
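
A minimal sketch of this filtering step, assuming event_datetime is already a datetime column and that the cutoff date itself is kept (the text leaves the boundary open). The helper name keep_from and the sample rows are hypothetical.

```python
import pandas as pd


def keep_from(logs: pd.DataFrame, start: str = "2019-07-31") -> pd.DataFrame:
    """Keep only rows whose event_datetime is on or after the start date."""
    cutoff = pd.Timestamp(start)
    # Assumption: the cutoff is inclusive.
    return logs[logs["event_datetime"] >= cutoff].reset_index(drop=True)


# Hypothetical sample, not taken from the real log.
sample = pd.DataFrame(
    {
        "event_datetime": pd.to_datetime(
            ["2019-07-25 10:00:00", "2019-07-31 00:00:01", "2019-08-03 12:30:00"]
        )
    }
)
filtered = keep_from(sample)
print(len(sample), "rows before,", len(filtered), "rows after")
```

Comparing row and user counts before and after the filter is a quick way to confirm that only a small share of the data was dropped.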


48% of users make the entire journey from their first event to payment.
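
One way such a share could be computed is sketched below. It uses the EventName and DeviceIDHash columns from the log description; the payment event name "PaymentScreenSuccessful", the helper funnel_share, and the sample rows are assumptions made for illustration, and the sketch ignores the order in which a user performed the events.

```python
import pandas as pd


def funnel_share(logs: pd.DataFrame, first_event: str, last_event: str) -> float:
    """Share of users who did first_event and also did last_event."""
    users_first = set(logs.loc[logs["EventName"] == first_event, "DeviceIDHash"])
    users_last = set(logs.loc[logs["EventName"] == last_event, "DeviceIDHash"])
    if not users_first:
        return float("nan")
    # Count only users present at both ends of the funnel.
    return len(users_first & users_last) / len(users_first)


# Hypothetical sample: four users start, two of them pay.
sample = pd.DataFrame(
    {
        "DeviceIDHash": [1, 2, 3, 4, 1, 2],
        "EventName": [
            "MainScreenAppear",
            "MainScreenAppear",
            "MainScreenAppear",
            "MainScreenAppear",
            "PaymentScreenSuccessful",  # assumed event name
            "PaymentScreenSuccessful",
        ],
    }
)
share = funnel_share(sample, "MainScreenAppear", "PaymentScreenSuccessful")
print(f"{share:.0%} of users reach payment")
```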


Note: the critical significance level (alpha) is 0.05.
In the final step, we studied the results of the experiment and tested several hypotheses. The outcomes are as follows:
MainScreenAppear is the most popular event in both control groups: 2450 users in group 246 and 2476 users in group 247 performed it, which is 98% of the users in each group. We failed to reject the null hypothesis that there is no statistically significant difference between the groups, which suggests that the groups were split properly.
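
A comparison like this is commonly done with a two-sided z-test for two proportions; the text does not say which test was used, so the sketch below is an assumption. The event counts 2450 and 2476 come from the text, while the group sizes 2500 and 2526 are hypothetical placeholders chosen only to be roughly consistent with the stated 98%.

```python
import math


def proportions_ztest(successes_a, total_a, successes_b, total_b):
    """Two-sided z-test for equality of two proportions (pooled variance)."""
    p_a = successes_a / total_a
    p_b = successes_b / total_b
    # Pooled proportion under the null hypothesis of no difference.
    p_pool = (successes_a + successes_b) / (total_a + total_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / total_a + 1 / total_b))
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


ALPHA = 0.05  # critical significance level from the note above

# Group sizes are hypothetical placeholders, not values from the data.
z, p_value = proportions_ztest(2450, 2500, 2476, 2526)
if p_value < ALPHA:
    print("Reject the null hypothesis: the shares differ.")
else:
    print("Failed to reject the null hypothesis: no significant difference.")
```

Because the same check is repeated for every event and every pair of groups, a correction for multiple comparisons may be worth considering when interpreting the p-values.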