New Standard Operating Procedures

From Your Arrival to Your First Publication

(Last Revised 5-22-2017)

A Note to New Researchers:  We are fortunate to have a dynamic group of people in our Lab. Other than the full-time staff and post-docs, about two-thirds of the people in the Lab throughout the year visit for less than a year – most for only three months. Following their time with us, they return to their home country or home university. While here, these visiting researchers join on-going projects, start new ones, and become immersed in our day-to-day work and adventures. Visitors bring a wide range of experience from different disciplines that each have their own norms and standards for research.

When people leave the Lab and return home, they have a new life. What they’ve started or worked on while here becomes a past chapter in their life.  It becomes difficult for them to complete these past projects, or for new researchers to pick up where they left off. This lack of continuity can be challenging.

Having an explicit set of Standard Operating Procedures (SOPs) for research in our Lab helps provide continuity, better supervision, and quality control. Some of these procedures will take extra time and may initially seem like an added burden. However, these procedures will ultimately lead to better discoveries, greater ease in following up on projects, and fewer “orphaned” projects. These SOPs will also help minimize some unintended research errors. Furthermore, considering that data management practices and rules are rapidly changing, the implementation of SOPs can help the Lab respond to changing requirements.

A. Let’s Start Before You Arrive

When you learn that you are accepted into the Lab (or hired) it is time to begin preparing. If you’re hired as a new staff member or research assistant, you will be asked to complete five tasks within your first week on the job. If you’re a new researcher – visiting professor, post-doc, visiting graduate student, or intern – we ask that you complete these before you arrive. They’ll be the subject of your Day 1 orientation:

            [] A1. Take the online IRB training and become certified

            [] A2. Complete the Solving Behavioral Mysteries Worksheet

            [] A3. Read at least 2 of the articles on scientific replication

            [] A4. Read the website and links found at:

            [] A5. Read this SOP document.

B. Planning and Designing Studies

Research ideas usually result from two or three people brainstorming a concept, writing an IRB application, running the study, and analyzing the data to see if it worked. But we could design even better studies if more heads were involved.

We’re now setting aside time in the Wednesday Workshop to brainstorm, critique, and modify any new study idea that is going to be submitted to the IRB. Although many are only pilot studies, this will still strengthen them and reduce the need for later course correction. At the planning phase, there are two important steps to ensure project continuity.

The best way we’ll ensure project continuity is by completing a Research Analysis Plan. With this, we’ll know what people were thinking when they originally collected the data. The easiest way to do this is at the point when we submit IRB requests. At that point we will also complete and attach the one-page Research Analysis Plan (Appendix A). While the IRB doesn’t require this, it will be an easy way to have all of our initial thinking in one place. The plan contains four key items: 1) rationale (background information) and hypothesis(es), 2) design of the experiment/data collection method, 3) description of the statistical analyses, including the exploratory analyses, and 4) the file names that will be used. To underscore, here are the two ways we are maintaining project continuity at the planning phase:

While creating a Research Analysis Plan will be internally helpful, we will also archive the Research Plan and hypotheses in a site such as CISER or

To determine how many subjects to recruit for a study, we will do a sample size calculation prior to data collection. If no power analysis is possible with current data, we will use a two-step procedure. First, we will run a pilot study and collect a small sample to obtain the information needed to estimate the appropriate sample size for the study. Second, we will use conservative estimates of variance to estimate the sample size.

This will take some course correction and adjustment. As a default, we aim for 80% power to detect a medium-sized effect. For example, when comparing two groups we will need 64 subjects per group to call a medium effect size (0.5) significant at an alpha level of 0.05 with a two-sided test and 80% power. This has two other implications for us (since it’s not always easy to get 200 subjects for lab or field studies): First, we will aim to simplify our designs or make the interventions or manipulations strong while still being realistic. Second, we will aim for more multi-study papers that involve replications or extensions (even if we just mention a replication in a footnote).
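As a sanity check, the per-group sample size quoted above can be reproduced with a short script. This is a sketch using the normal approximation plus the usual small-sample correction (the function name is ours, not part of any package); it reproduces the 64-per-group figure for a medium effect:

```python
from math import ceil
from statistics import NormalDist

def n_per_group(d, alpha=0.05, power=0.80):
    """Per-group sample size for a two-group comparison of means,
    two-sided test, via the normal approximation with the usual
    small-sample correction (z_alpha^2 / 4)."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)  # ~1.96 for alpha = .05
    z_b = NormalDist().inv_cdf(power)          # ~0.84 for 80% power
    n = 2 * ((z_a + z_b) / d) ** 2             # ~62.8 for d = 0.5
    return ceil(n + z_a ** 2 / 4)              # correction rounds up to 64

print(n_per_group(0.5))  # medium effect -> 64 per group
```

Plugging in a large effect (d = 0.8) instead shows why strong manipulations help: the requirement drops to 26 per group.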

[] B1. Complete the Research Analysis Plan template for each new project and include it at the end of all IRB applications.

[] B2. Preregister hypotheses: Where appropriate, hypotheses will be preregistered and/or archived on a site such as CISER.

[] B3. When doing a power analysis from a pilot study to estimate the necessary sample size, we will oversample the target number of subjects and the number of occasions or sessions.

C. Time to Collect Data

Past data collection in the Lab has largely been self-directed by the lead researcher or first author. Moving forward, a set of standard operating procedures (SOPs) for collecting data will be followed for all projects. One of these procedures will involve closer supervision of both the lab studies and field studies we conduct (see Appendix B). Additionally, where there are multiple studies in a project, an organizing system, such as that in Appendix C, can help us provide easier supervision of the research.

[] C1. When conducting a study, a sponsoring professor or post-doc must be present at either the introduction or debriefing of at least 80 percent of a study’s sessions. Graduate students will be supervised when collecting any data (except data from online studies).

[] C2. Proposed studies need to be evaluated by a subset of the Director, Deputy-Director, and Lab Manager based on the mission of the Lab (for instance, its potential to provide a healthy eating solution that is scalable) and any risks to the reputation of the Lab or to its operations.

[] C3. Following the study, a detailed procedure and debriefing of the study along with the data (both electronic and any hard copy surveys, coding sheets, etc.) will be given to the Deputy Director and Publications Assistant to archive.

[] C4. A photo of the study set-up, location, and stimuli will be taken for future reference. The photo will be named consistent with our naming convention, and saved with other files. If people gave consent to be taped, a brief 1-minute walkthrough of the procedure can be filmed on a smart phone.

D. Storing and Analyzing Data

The Research Analysis Plan will bring future researchers up to speed on how to rescue an orphaned project that was left incomplete when a visitor to the Lab returned to their home institution. With dozens of studies being run each year, it can be difficult to keep data sets organized and versioned correctly without a clear and common set of conventions.

The first step is following the naming and storing convention. For instance, one of the reasons it was so difficult to find a specific data file, such as the Aiello’s pizza buffet data, was that it had been collected nine years earlier, was vaguely named, and was located on a computer that had been phased out of use. It would have been easier to find if it had been named “PizzaBuffet-Aiello’s-WhitneyPoint-10-18-08-JN” and archived in a central location.

This is an example of our file naming convention. It includes the following:

  • Code name (ForeignWeight, GlutenFree)
  • Identifying words (Pringles, Aiellos)
  • Location (Lab, 401, US, Mall, MTurk, Panel, or City/State)
  • Date collected or entered (2-21-17)
  • Initials or name of who collected it
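As an illustration, a tiny helper (hypothetical; the function and argument names are ours, not part of the convention itself) can assemble the five parts into a convention-compliant name:

```python
def study_filename(code, identifier, location, date, initials):
    """Join the five naming-convention parts with hyphens, in order."""
    return "-".join(part.strip() for part in
                    (code, identifier, location, date, initials))

print(study_filename("PizzaBuffet", "Aiello's", "WhitneyPoint",
                     "10-18-08", "JN"))
# PizzaBuffet-Aiello's-WhitneyPoint-10-18-08-JN
```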

Moving forward, here’s how we’ll deal with data storage and how we can bring ourselves up to speed on any previous data analyses that were originally done.

[] D1. The new naming convention will be used for all files. 

[] D2. After data is initially entered, the raw data file will be saved with the suffix “original” and the cleaned file will be saved with the suffix “cleaned,” along with notes, the corresponding command/script files of all changes that were made in cleaning, and the names of those who did the cleaning. This includes the code file that did the data cleaning.

[] D3. Both files will be given to the Deputy Director and Publications Assistant and archived on a hard drive and on our campus encrypted intranet.

[] D4. All analyses will be produced from a copy of the cleaned data using an analysis script. The script will include all new data definitions and any other procedures necessary to conduct the analysis, beginning with the cleaned data set. Every definition or reshaping of the data will include comments that can be easily understood by subsequent lab workers. Additionally, comments will indicate which analyses will appear in which tables, figures, or text in any resulting manuscript. Comments will also indicate the name of the individual(s) authoring the script. All new data files should be saved. Researchers will save all analysis scripts using the naming convention. (Stata works best for this, but it can be done with SPSS and SAS as well.) Any new edits or additional versions of script files will be saved separately, again using the naming convention.

[] D5. When a manuscript is ready to submit, the data file and analysis scripts will be verified by the Cornell Statistical Consulting Unit (F6) and sent to CISER for archiving. The process of verification will certify that the numbers appearing in any manuscript are those produced by the script file and the data. The Publications Assistant will take care of this and relay any needed changes back to the authors.
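The file-handling rules in D2 can be sketched in a few lines. This is a minimal Python illustration under our own assumptions (the Lab’s real cleaning is done in Stata/SPSS/SAS scripts, and “dropping blank rows” here is only a placeholder for whatever cleaning a study actually needs): it preserves an untouched “-original” copy, writes a “-cleaned” copy, and logs what changed and who changed it.

```python
import csv
import os

def clean_dataset(raw_path, cleaner_name):
    """Save an untouched '-original' copy of the raw file, write a
    '-cleaned' copy (placeholder cleaning: blank rows dropped), and
    log what was changed and by whom (per D2)."""
    base, ext = os.path.splitext(raw_path)
    with open(raw_path, newline="") as f:
        rows = list(csv.reader(f))

    # Raw copy: archived as-is and never edited again.
    with open(base + "-original" + ext, "w", newline="") as f:
        csv.writer(f).writerows(rows)

    # Cleaning step (placeholder): drop rows that are entirely blank.
    kept = [r for r in rows if any(cell.strip() for cell in r)]
    cleaned_path = base + "-cleaned" + ext
    with open(cleaned_path, "w", newline="") as f:
        csv.writer(f).writerows(kept)

    # Change log: what was done, and by whom.
    with open(base + "-cleaning-log.txt", "w") as f:
        f.write(f"Cleaned by {cleaner_name}: dropped "
                f"{len(rows) - len(kept)} blank row(s).\n")
    return cleaned_path
```

Run against a file named with the convention, this yields the “-original” and “-cleaned” pair that D3 says to hand to the Deputy Director and Publications Assistant.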

E. Avoiding Duplicate Passages and Publication

Duplicate publication happens when similar or identical sections are reprinted from one paper we’ve written to another.  It can also happen when one paper is revised and targeted at a different population (e.g., practitioners) or extended from a journal article into a book chapter.  It can also occur when a dataset is reused to do additional or more detailed analyses as part of a new paper.

First, to be certain that duplicate passages or publications do not occur, we will run all of our papers through a software program that helps identify any duplicate passages before submission. This needs to be done before submitting a paper for the first time. The program we are experimenting with highlights passages that are similar to other publications on the web. Our policy will be to directly cite, quote, or rewrite any passage of two or more sentences. If extensive revisions are required before the paper is accepted, the author is encouraged to do this again when the paper is conditionally accepted.

Second, there are occasions when it may be appropriate to publish similar findings from an already published data set (such as when invited to write an article for a practitioner journal or when expanding a short journal article into a larger book chapter).  In these cases, both editors will be informed of this and it will be clearly stated that the data or the table has already been published in an earlier article.

In combination, these should prevent publication overlap from happening.
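A rough sense of how such screening works can be given with the standard library alone. This is a toy sketch under our own assumptions (difflib similarity over two-sentence windows is our stand-in; the commercial tool the Lab is experimenting with works differently): it flags any draft passage of two consecutive sentences that closely matches a published text.

```python
import difflib
import re

def flag_duplicate_passages(draft, published, threshold=0.9):
    """Return draft passages of two consecutive sentences whose
    similarity to some two-sentence window of `published` meets
    `threshold` (1.0 = identical)."""
    def sentences(text):
        return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text)
                if s.strip()]
    d, p = sentences(draft), sentences(published)
    flagged = []
    for i in range(len(d) - 1):
        window = " ".join(d[i:i + 2])
        if any(difflib.SequenceMatcher(None, window,
                                       " ".join(p[j:j + 2])).ratio()
               >= threshold for j in range(len(p) - 1)):
            flagged.append(window)
    return flagged
```

Per E1, anything flagged would then be cited, quoted, or rewritten before first submission.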

[] E1. Before submitting a paper, the paper needs to be proofed by duplication software to make certain that any duplicate passage of two or more sentences is either cited, quoted, or rewritten.

[] E2.  If substantial revisions are made to a paper before publication, the paper should be reproofed at the conditional acceptance stage.

F. Writing and Submitting the Paper

While our basic paper outlining, writing, and editing process is very efficient, upon reflection, our referencing of unpublished papers can be improved. We will also be changing some basic strategies to further improve the contribution of some of our papers with second studies, even if they are smaller confirmatory studies that don’t necessarily extend the theory.

If the data used in a paper is proprietary and cannot be shared, we need to be extremely clear about what we can and cannot share. If data (such as supermarket chain sales data) is proprietary, we will emphasize this in three different places: 1) in the letter to the editor, 2) in the first paragraph of the methods section, and 3) in a footnote of the paper.

[] F1. Any other papers using part of the same data set, even if unpublished or working papers, will be cited. 

[] F2. Extra care will be taken with exploratory studies to underscore that they are exploratory in the abstract, introduction, results, discussion, and limitations. We will state explicitly whether the analysis is in accordance with the original analysis plan or not.

[] F3. Consider adding more 2nd and 3rd studies to papers, even if they are lab studies or on-line studies.

[] F4. A last check for other internal IRB submissions on the topic of this publication, along with web searches for unpublished papers, will be made prior to submission. Any relevant studies found should be included in the references.

[] F5. If a paper involves proprietary data, we will note this in the letter to the editor, the first paragraph of the methods, and in the acknowledgment and in the author info footnote.

[] F6. Before being submitted, all papers and conference submissions will have their data and data analysis scripts sent to the Cornell Statistical Consulting Unit (CSCU) for verification (D5).

G. “Conditionally Accepted” and Archived

Recall that before we submit a paper, the data and analysis scripts will be verified by the Cornell Statistical Consulting Unit to confirm they produce the output in our tables, and archived by CISER as indicated in D5. After that, the paper gets submitted, with each submission recorded, feedback from the reviews carefully processed, and the paper adjusted accordingly until a revise and resubmit (R&R) is received. It is crucial to carefully review reject reviews as well as R&Rs, as they may contain useful information that could help improve the manuscript.

If the review process for either rejections or R&Rs did not require more data analysis, we’ll post the original data analysis. If reanalysis was needed during the review process, we’ll include the new analysis scripts and log files and submit them again for verification with CSCU.

It’s at this point that we need to make sure that, if possible, the data is de-identified and the variable list is easy to follow. With data from grocery stores, restaurants, and hospital cafeterias this will be much more difficult, since all of our current agreements specify that data will not be shared. Moving forward, we will change future agreements so that they allow the data to be disguised and parts of it to be shared.

Beginning in April 2017, we will make nonproprietary data sets of newly accepted papers available, along with analysis scripts, on a publicly accessible website. For data published prior to that, and when there is no agreement with the journal, we’ll deal with any requests on a paper-by-paper basis.

Appendix A.

Research Analysis Plan

I. Project Name:

(Ex. GlutenFreeDining-RisleyHall-3-10-17-AB.docx)

II. Lead Researcher(s):

III. Hypotheses and Basic Rationale:

(Example: After announcing that Risley Dining Hall went gluten-free, health-conscious diners rate most foods as tasting better (even salads), but non-health-conscious diners rate them as tasting worse.)

IV. Study Design and Sample:  

V. Basic Analyses:

(Example: Key table shells, covariates, subgroups to be analyzed, etc.)

Back to A.

Appendix B.

Supervision of Lab Studies


The Food and Brand Lab was founded in 1997 at the University of Illinois at Urbana-Champaign, and it moved to Cornell in 2005 with this mission: “We change how food is purchased, prepared, and consumed. Using new tools of behavioral science, we invent healthy eating solutions for consumers, companies, and communities.” The principal mission of the Lab is to effect eating-related changes in a broad community.

Past use of the Lab has largely been self-directed by the researcher who was in charge of the study. This may have caused inefficiencies in recruitment and in how data was collected and stored. Moving forward, these will be the new standard operating procedures (SOPs) for using the Lab.

[] When conducting a study, a sponsoring professor or post-doc must be present at either the introduction or debriefing of at least 80 percent of a study’s sessions. Studies cannot be conducted alone by graduate students or by research assistants (RAs).

[] Proposed studies must be evaluated by a subset of the Director, Deputy-Director, and Lab Manager based on the mission of the Lab (for instance, its potential to provide a healthy eating solution that is scalable) and any risks to the reputation of the Lab or to its operations.

[] Following the study, a detailed procedure and debriefing of the study along with the data will be given to the Director, Lab Manager, and Publications Assistant to archive.

[] A photo of the study set-up, location, and stimuli will be taken, named consistent with the naming convention, and saved with other files. If people gave consent to be filmed, a brief 1-minute walkthrough of the procedure can be filmed on a phone.

Back to B.

Appendix C.

One Example of How to Organize

(Dissertation) Project Files Involving Multiple Studies

  1. Original Idea (Original Hypotheses)
  2. Literature Review
    1. Variable A Literature
    2. Variable B Literature
    3. Theory Literature
  3. IRB
  4. Separate Files for Each Study (e.g., Study 1-6) which contain:
    1. Materials
      1. Papers from which scales/tasks were taken
      2. PDF with Surveys
      3. Specifications of Equipment (if any used)
    2. Data Collection
      1. (Scanned Surveys or Surveys)
      2. mTurk Receipts
      3. Photos
      4. Recordings
    3. Raw Data
      1. Working File
    4. Data analysis
      1. SPSS Output
      2. Tables
  5. Write-Up
  6. Submission Documents

Contact Information (Author’s relevant information)

This is a schematic of what a folder for a specific study might look like in an online storage system such as Box.
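For studies stored on a shared drive rather than Box, the skeleton above can be scaffolded automatically. A minimal sketch (the folder names are abbreviated from the Appendix C outline and the function name is ours; adjust the tree to fit each project):

```python
import os

# Per-study subfolders, abbreviated from the Appendix C outline.
STUDY_TREE = {
    "Materials": ["Scale Sources", "Surveys", "Equipment Specs"],
    "Data Collection": ["Surveys", "mTurk Receipts", "Photos", "Recordings"],
    "Raw Data": ["Working File"],
    "Data Analysis": ["SPSS Output", "Tables"],
}

def make_study_folders(project_root, study_name):
    """Create the Appendix C subfolder skeleton for one study."""
    for parent, children in STUDY_TREE.items():
        for child in children:
            os.makedirs(os.path.join(project_root, study_name,
                                     parent, child), exist_ok=True)
```

For example, make_study_folders("GlutenFree", "Study 1") creates the whole tree under GlutenFree/Study 1/.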

Back to C. 
