Data Analysis on unforgettable.me: Getting Started and Descriptive Statistics

unforgettable.me provides a data analysis system that lets you conduct analyses quickly and easily without ever gaining access to the raw data, thus creating a first ring of privacy protection for the participants. The system provides mechanisms to transform the data in preparation for analysis, a set of plotting tools, and a set of analyses including descriptive statistics, chi-squared (Chi2), correlation, ANOVA, regression and principal components analyses.

To begin, log in to unforgettable.me and click on the Marketplace tab:

If you haven't already, click on the link to register as a researcher. Registration happens automatically.

Then select Projects under Researcher on the left hand side:

Navigate to the project entitled "Demo Project (Event Segmentation Project V2)" and click Analyze:

You should now see the project title and a list of buttons that allow you to create analyses and plots of this data set.

Click on "Descriptives" to create your first analysis:

On the left hand side of the screen you will see the controls for specifying the parameters of the analyses you create. The results will appear on the right hand side.

The Event Segmentation Project required participants to collect data for two weeks. They used the Unforgettable app to collect GPS, accelerometry and audio data. The events that appeared in their calendars were recorded using IFTTT. In addition, seven times a day, the SEMA 3 app asked them to record information about the last event in which they were involved - including who they were with, what they were doing and where they were. Finally, they filled in the following surveys:

The questions asked can be seen by clicking on the Surveys tab of the unforgettable.me website.

Under the "Select From" list you can see all of the variables that you have to work with in this project. The start of each variable name indicates where the data came from. __App__ means it came from the unforgettable.me app. __IFTTTGoogleCalendar__ means that it came through IFTTT from the participants’ Google Calendar. __SEMAg5zn5ggo_B__ means that it came from the SEMA survey number g5zn5ggo_B, the second iteration (B) of a custom SEMA survey that we created for this project. __USurveyDemographic__ means the data came from the unforgettable.me demographic survey, __USurveyGeographic__ means that the data came from the unforgettable.me geographic survey, and the data from the other surveys are named similarly.
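The naming convention described above can be illustrated with a short Python sketch. The helper name split_variable and the regular expression are assumptions made for illustration; they are not part of unforgettable.me, which handles these names for you.

```python
import re

# Hypothetical pattern: "__<source>__<field>", e.g. "__App__Latitude".
# The source is matched lazily so that a single underscore inside it
# (as in "SEMAg5zn5ggo_B") does not end the prefix early.
_VARIABLE_NAME = re.compile(r"^__(?P<source>.+?)__(?P<field>.+)$")


def split_variable(name):
    """Split a variable name into (source, field)."""
    match = _VARIABLE_NAME.match(name)
    if match is None:
        raise ValueError(f"unexpected variable name: {name!r}")
    return match.group("source"), match.group("field")


for name in ("__App__Latitude", "__USurveyDemographic__Education"):
    source, field = split_variable(name)
    print(f"{field} comes from {source}")
```

Grouping variables by the source part is a quick way to see which instrument (app, calendar, SEMA or survey) each measure belongs to.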

To run your first analysis, select __App__Latitude in the "Select From" list and then press the arrow button. The variable will move from the "Select From" list to the "Selected" list. Now click the Update button.

On the right hand side you will see the corresponding results. Count refers to the number of data points. In this demonstration project there are 10 participants. Then you see the mean, standard deviation (std), min, 25th percentile, 50th percentile (median), 75th percentile and max of the latitudes, along with the number of data values that were missing (none in this case). You may have to scroll using the lower-right scroll bar to see all variables.
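To make the meaning of each statistic concrete, here is a minimal Python sketch that computes the same set of measures for a small made-up column. It is not the code unforgettable.me runs: the function name describe, the use of None for a missing value, the choice to exclude missing values from the count, and the linear-interpolation percentile method are all assumptions.

```python
import statistics


def describe(values):
    """Summarise a numeric column; None marks a missing value (assumption)."""
    present = sorted(v for v in values if v is not None)
    # "inclusive" interpolates linearly between the observed data points.
    q1, median, q3 = statistics.quantiles(present, n=4, method="inclusive")
    return {
        "count": len(present),
        "mean": statistics.mean(present),
        "std": statistics.stdev(present),  # sample standard deviation
        "min": present[0],
        "25%": q1,
        "50%": median,
        "75%": q3,
        "max": present[-1],
        "missing": len(values) - len(present),
    }


# Made-up values, not data from the demo project.
column = [1.0, 2.0, 3.0, 4.0, 5.0, None]
for statistic, value in describe(column).items():
    print(f"{statistic:>8}: {value}")
```

The 50% row is the median, so comparing it with the mean gives a quick indication of whether the distribution is skewed.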

Now let's suppose you don't want the statistics of __App__Latitude after all. Rather, you want __App__Longitude and __App__Temperature instead. You can move __App__Latitude back to the "Select From" list by clicking on it (note the arrow changes direction):

and then clicking on the arrow. Now select __App__Longitude from the "Select From" list and press the arrow, and select __App__Temperature from the "Select From" list and press the arrow:

Click Update and you will see the new results on the right hand side:

So far we have considered only numeric variables, but one can also generate descriptive statistics on categorical variables. Add __USurveyDemographic__Education, __USurveyDemographic__Gender and __USurveyGeographic__CountryOfBirth to the selected list and click Update. You will see the following results:
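For categorical variables, means and percentiles are not meaningful, so summaries are usually built from counts instead. The Python sketch below assumes a typical set of measures (number of values, number of distinct categories, the most frequent category and how often it occurs); the statistics that unforgettable.me actually displays may differ. The function name describe_categorical and the education values are invented for illustration.

```python
from collections import Counter


def describe_categorical(values):
    """Summarise a categorical column; None marks a missing value (assumption)."""
    present = [v for v in values if v is not None]
    counts = Counter(present)
    top, freq = counts.most_common(1)[0]  # most frequent category
    return {
        "count": len(present),
        "unique": len(counts),
        "top": top,
        "freq": freq,
        "missing": len(values) - len(present),
    }


# Made-up responses, not data from the demo project.
education = ["Bachelor", "Masters", "Bachelor", None, "PhD"]
for statistic, value in describe_categorical(education).items():
    print(f"{statistic:>8}: {value}")
```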

You have now completed your first analyses using the variables that the unforgettable.me system provides. In subsequent tutorials, we will cover how the system organizes the data for entry into the analyses, how to use the plotting capabilities and each of the analyses that are available.