Five Key Issues

  1. Setting Goals. Goals influence the nature of the data gathering sessions, the data gathering techniques to be used, and the analysis to be performed.
  2. Identifying Participants. Those who fit the profile of types of people from whom data can be gathered are called the study population. Types of sampling are as follows:
  • Probability Sampling: simple random sampling or stratified sampling
  • Nonprobability Sampling: convenience sampling or volunteer panels
    • Convenience Sampling: sample includes those who were available rather than those specifically selected
  3. Relationship with Participants: obtain informed consent and keep a clear, professional relationship between participant and researcher (however, informed consent is generally not required when gathering requirements data for a commercial company, where a contract usually exists between collector and provider)
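
The two probability sampling types named above can be sketched in a few lines. This is a minimal illustration, not a prescribed procedure: the participant list, the `stratum_of` function, and the sample sizes are all hypothetical.

```python
import random
from collections import defaultdict


def simple_random_sample(population, k, seed=None):
    """Draw k members uniformly at random, without replacement."""
    rng = random.Random(seed)
    return rng.sample(population, k)


def stratified_sample(population, stratum_of, per_stratum, seed=None):
    """Draw up to per_stratum members at random from each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for person in population:
        strata[stratum_of(person)].append(person)
    sample = []
    for name in sorted(strata):
        members = strata[name]
        sample.extend(rng.sample(members, min(per_stratum, len(members))))
    return sample


# Hypothetical study population: (participant id, experience level).
population = [(f"P{i}", "novice" if i % 3 else "expert") for i in range(1, 31)]

random_sample = simple_random_sample(population, 6, seed=1)
balanced_sample = stratified_sample(population, lambda p: p[1], 3, seed=1)
```

Simple random sampling may by chance pick few or no experts, whereas the stratified version guarantees each experience level is represented.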

Triangulation: the investigation of a phenomenon from at least two different perspectives. This is mostly focused on verification and reliability of data rather than making up for the limitations of another type of methodology

  1. Triangulation of data: data is drawn from different sources at different times/places/people
  2. Investigator triangulation: different researchers (observers, interviewers, and so on) have been involved in collecting and interpreting the data
  3. Triangulation of theories: use of different theoretical frameworks through which to view data
  4. Methodological triangulation: employ different data gathering techniques


A conversation with a purpose

Good for exploring issues, learning more about tasks, and getting inside the user’s head.

  1. Open-ended/unstructured: exploratory and similar to a conversation. Can be time-consuming but can also produce rich insights
  2. Semi-structured: follows a basic script with both closed and open questions, and probes the interviewee until no new relevant information emerges
  3. Structured: predetermined questions like a questionnaire; the study is standardized (same questions for each participant)
  4. Group interviews: 3-10 people selected to provide a representative sample of the target population. Useful for investigating shared issues rather than individual experiences


When developing Interview Questions, keep in mind open questions are best suited where the goal of the session is exploratory; closed questions are best suited where the possible answers are known in advance. Break long or compound questions into separate questions

A lot of decisions to make:

  • Choosing a framework
  • Level of participation to adopt
  • How to make a record of the data
  • How to gain acceptance into the group being studied
  • How to ensure that the study uses different perspectives

Ethnography: the description of the customs of people and cultures. A distinguishing feature of ethnographic studies compared with other data gathering is that a situation is observed without imposing any a priori structure or framework upon it, and everything is viewed as “strange”.

| Technique | Good for | Kind of data | Advantages | Disadvantages |
| --- | --- | --- | --- | --- |
| Interviews | Exploring issues | Mostly qualitative (some quantitative) | Interviewer can guide; encourages contact between researchers and users | Artificial environment might be intimidating; removes users from their usual environment |
| Focus groups | Collecting multiple viewpoints | Mostly qualitative (some quantitative) | Highlights areas of agreement/conflict; encourages contact between researchers and users | Possibility of dominant characters |
| Questionnaires | Answering specific questions | Quantitative and qualitative | Can reach many people with low resource requirements | Design is key; response rates may be low |
| Direct observation in the field | Understanding context of user activity | Mostly qualitative | Observational insights | Very time-consuming; huge amounts of data |
| Direct observation in a controlled environment | Capturing the detail of what individuals do | Quantitative and qualitative | User can focus on task without interruption | Data may be of limited use due to artificial environment |
| Indirect observation | Observing users in natural environment without distraction | Quantitative (logging) and qualitative (diary) | Data collection can run over long periods thanks to automatic recording | Large amounts of data imply a need for tools to support analysis; participants may exaggerate memories |

Running the interview

Before starting, make sure that the goals of the interview have been explained to the interviewee and that they are willing to proceed. Listen more than you talk, respond with sympathy but without bias, and appear to enjoy the interview.

  1. Intro
    • interviewer introduces themselves
    • explain why you’re doing the interview
    • reassure interviewee re: ethical issues
    • ask interviewee if they mind being recorded
  2. Warm-up session
    • easy, nonthreatening questions
  3. Main session
    • questions presented in logical sequence
    • probing questions at the end
    • order may vary in semi-structured interview
  4. Cooling-off period
    • easy questions to defuse any tension
  5. Closing session
    • interviewer thanks interviewee
    • switch off recorder or put notebook away


Users may be observed directly by the investigator as they perform their activities or indirectly through records of the activity that are studied afterward.

Observation can also produce a lot of data to sift through and can be more complicated to do well than first appreciated. A clearly stated goal is therefore important to give an observation session focus.

Example frameworks:

  • The person: Who is using the technology at any particular time?
  • The place: Where are they using it?
  • The thing: What are they doing with it?

3 common approaches

  1. Simple observation: user is given a task, the evaluator just watches. This gives no insight into users’ decision process
  2. Think aloud: subjects are asked to say what they are thinking/doing. However, it’s hard to talk while concentrating, and thinking aloud may alter the way people naturally perform the task.
  3. Co-discovery learning: two people work together on a task and normal conversation is monitored.

Degree of Participation

  1. Passive Observer: observer who adopts an approach at the outsider end; does not take part in the study environment at all
  2. Participant Observer: adopts an approach at the insider end; becomes a member of the group being studied

Coding Sheet

A data recording instrument in which a list of itemized coding options is structured.

This standardizes observation practices, which makes them more objective.


  • evaluation goals (break it all down!)
  • stage of design
  • observation method/types of data
    • what would potentially be an interesting finding from this particular style?
    • e.g. for think-aloud, it might be good to record action vs spoken comments
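
A coding sheet can be as simple as a fixed list of codes plus a log of time-stamped entries. The sketch below assumes a think-aloud session in which actions and spoken comments are recorded separately. The code names in `CODES` and the `Observation` fields are hypothetical; a real sheet would derive them from the evaluation goals.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical coding options for a think-aloud session.
CODES = {
    "NAV": "navigation step",
    "ERR": "error or slip",
    "HELP": "asks for help",
    "CONF": "expresses confusion",
}
CHANNELS = ("action", "spoken")


@dataclass(frozen=True)
class Observation:
    minute: int
    participant: str
    code: str
    channel: str  # "action" or "spoken"


def record(sheet, minute, participant, code, channel):
    """Append one entry, rejecting anything outside the predefined options."""
    if code not in CODES:
        raise ValueError(f"unknown code: {code}")
    if channel not in CHANNELS:
        raise ValueError(f"unknown channel: {channel}")
    sheet.append(Observation(minute, participant, code, channel))


def tally(sheet):
    """Count entries per (code, channel) pair."""
    return Counter((obs.code, obs.channel) for obs in sheet)


sheet = []
record(sheet, 1, "P1", "NAV", "action")
record(sheet, 2, "P1", "CONF", "spoken")
record(sheet, 3, "P1", "ERR", "action")
record(sheet, 4, "P1", "CONF", "spoken")
print(tally(sheet))
```

Rejecting unknown codes is what keeps different observers recording the same kinds of events in the same way.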


Survey vs Questionnaire: the questionnaire is a part of the survey. The questionnaire is just the concrete things you’re asking.


  • cheap
  • does not require presence of evaluator
  • many results can be quantified


  • preparation is “expensive” → need to design questions well
  • can have low response rate or low quality response
  • difficult to do in-depth “probing”

A questionnaire is good when motivation is high enough without anyone else present. If persuasion is needed, a structured interview is probably better

Designing a Questionnaire

Keep in mind

  • what info is sought?
  • how would you analyze results?
  • what audience do you want to reach?
  • what will you do with your analysis?

Don’t use vague questions, and pilot the questionnaire before testing.


  • questions should not be transferable to other interfaces and should not be open to different interpretations depending on judgment (i.e. keep them domain-specific and clearly worded)
  • avoid double-barreled questions and leading questions
  • to de-bias: neutral language, can have random order of questions for different participants
  • for validity
    • use previously validated questionnaires
    • triangulation (ask multiple questions about the same matter)
    • piloting
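
The de-biasing idea of giving different participants a different question order can be sketched as follows. Seeding the shuffle with the participant id is an assumption of this sketch, made so that a participant who resumes the questionnaire sees the same order again. The function name, `study_seed`, and the sample questions are hypothetical.

```python
import random


def question_order(questions, participant_id, study_seed="pilot-1"):
    """Return a per-participant shuffled copy of the question list.

    The same participant id always yields the same order.
    """
    rng = random.Random(f"{study_seed}:{participant_id}")
    order = list(questions)  # copy so the master list stays untouched
    rng.shuffle(order)
    return order


questions = [
    "The search results were relevant.",
    "The page layout was easy to scan.",
    "I could find the settings I needed.",
    "Error messages told me what to do next.",
]

order_p1 = question_order(questions, "P1")
order_p2 = question_order(questions, "P2")
```

If questions build on each other, shuffle within blocks instead of across the whole questionnaire.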

Types of questions

  • Open-ended (hard to analyze rigorously)
  • Closed (easily analyzed but can be hard to interpret if not well-designed)
    • checkboxes and ranges
      • range of answers to demographic questions is predictable: offer a predefined list
      • interval doesn’t have to be equal in all cases, depends on what you want to know
      • mention how many boxes to check, be consistent with ascending/descending order
    • Likert and semantic differential scale
      • used for measuring opinions, attitudes, and beliefs
      • widely used for evaluating user satisfaction
      • Likert: a five, seven, or nine-point agreement scale used to measure respondents’ agreement with various statements
      • semantic differential scale: rely on choosing pairs of adjectives to explore a range of bipolar attitudes about particular opinions
    • ranked (closed)
      • respondent places ordering on items in a list
      • useful to indicate preferences
      • forced choice
    • multi-choice (closed)
      • offered choice of explicit responses
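
Answers on a Likert scale can be summarized with a frequency distribution and a median. This sketch treats the scale as ordinal and therefore reports the median, not the mean. That is a choice made here, not something the notes prescribe. The five labels and the sample responses are hypothetical.

```python
from collections import Counter
from statistics import median

# Hypothetical labels for a five-point agreement scale.
LABELS = {
    1: "strongly disagree",
    2: "disagree",
    3: "neutral",
    4: "agree",
    5: "strongly agree",
}


def summarize_likert(responses, points=5):
    """Return n, the median, and the count for every scale point."""
    for r in responses:
        if r not in range(1, points + 1):
            raise ValueError(f"response {r!r} is outside 1..{points}")
    counts = Counter(responses)
    distribution = {p: counts.get(p, 0) for p in range(1, points + 1)}
    return {"n": len(responses), "median": median(responses), "distribution": distribution}


# Responses to one statement, e.g. "The system was easy to learn."
summary = summarize_likert([4, 5, 3, 4, 2, 4, 5])
for point, count in summary["distribution"].items():
    print(f"{LABELS[point]:>17}: {count}")
```

Keeping zero counts in the distribution makes unused scale points visible, not silently dropped.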


  • think about ordering of questions → impact can be influenced by order
  • consider if different versions are needed for different populations
  • provide clear instructions on how to complete questionnaire
    • e.g. whether answers can be saved and completed later
  • think about length → avoid questions that don’t address study goals
  • consider allowing respondents to opt out at different stages especially if long → better to have some than none
  • think about layout and pacing

Data and Analysis

  • Subjective: what you were told happened
  • Objective: what you captured using your senses
  • Quantitative: data in the form of numbers or data that can be easily translated into numbers. Focuses on ascertaining the magnitude, amount, or size of something
  • Qualitative: data in the form of words and images. Focuses on nature of something (themes, patterns, and stories)

Note that quantitative data is not always objective! Subjectivity can come from participants in how they express opinions or from investigators during the data capturing/interpreting/analysis process.

Similarly, it is unfair to try to quantify all qualitative data. This needs justification. Also be wary of translating small population sizes into percentages.
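
One way to act on the warning about small populations is to always report the raw count and add a percentage only when the sample is large enough. The threshold of 20 below is an arbitrary assumption for illustration, not an established rule, and `report_share` is a hypothetical helper.

```python
def report_share(count, total, min_n_for_percent=20):
    """Format a finding as 'x of n participants', adding a percentage
    only when n reaches the (assumed) minimum sample size."""
    if total <= 0 or not 0 <= count <= total:
        raise ValueError("need 0 <= count <= total and total > 0")
    text = f"{count} of {total} participants"
    if total >= min_n_for_percent:
        text += f" ({round(100 * count / total)}%)"
    return text


small = report_share(3, 4)    # percentage withheld: "75%" would overstate 3 people
large = report_share(30, 40)  # percentage shown alongside the count
```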

What to focus on

  1. What are the most important needs/tasks to support?
  2. What are the repeated patterns?
  3. Key issues/areas that could be improved
  4. What surprised you?
  5. What is essential/nonessential in implementation


  1. Initial reactions or observations (identify patterns, simple numerical analysis like averages, ratios, percentages)
  2. Data cleansing (checking for erroneous entries and anomalies)
  3. Analysis
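
The first two steps can be sketched for a simple quantitative measure. The example assumes task completion times in seconds, logged as raw strings. The plausibility bounds (`low`, `high`) and the sample values are hypothetical.

```python
from statistics import mean


def cleanse(raw_times, low=0.0, high=600.0):
    """Split raw entries into valid times and rejected entries.

    An entry is rejected if it is not a number or falls outside (low, high].
    """
    valid, rejected = [], []
    for raw in raw_times:
        try:
            value = float(raw)
        except (TypeError, ValueError):
            rejected.append(raw)
            continue
        if low < value <= high:
            valid.append(value)
        else:
            rejected.append(raw)
    return valid, rejected


raw_times = ["42", "38.5", "", "-3", "51", "9999", "n/a"]
valid, rejected = cleanse(raw_times)

average = mean(valid)
print(f"{len(valid)} valid entries, {len(rejected)} rejected, mean {average:.1f} s")
```

Keeping the rejected entries, not discarding them, makes it possible to check later whether an anomaly was a logging error or a genuine finding.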

Qualitative Analysis

Thematic Analysis

Themes are a small number of high-level patterns that answer your evaluation questions.

The analysis goes from codes (descriptive labels) to categories (groupings imposed on codes) to themes (interpretive patterns). Deductive analysis is just the inverse (starting at themes and arriving at codes).

Do an initial pass to check for internal consistency: make sure themes occur across several or all participants. Then, step back to see if an overarching narrative emerges from the themes. One can then remove themes or look into why there are conflicts.
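
The consistency check can be sketched as a lookup of which participants support each theme. The data layout (a set of themes per participant), the theme names, and the `min_participants` cut-off are assumptions for illustration.

```python
def theme_support(themes_by_participant):
    """Map each theme to the sorted list of participants whose data shows it."""
    support = {}
    for participant, themes in themes_by_participant.items():
        for theme in themes:
            support.setdefault(theme, []).append(participant)
    return {theme: sorted(people) for theme, people in support.items()}


def weakly_supported(themes_by_participant, min_participants=2):
    """List themes backed by fewer participants than the chosen minimum."""
    support = theme_support(themes_by_participant)
    return sorted(t for t, people in support.items() if len(people) < min_participants)


# Hypothetical result of coding three interviews.
coded = {
    "P1": {"trust", "speed"},
    "P2": {"trust", "cost"},
    "P3": {"trust", "speed"},
}

support = theme_support(coded)
candidates_to_revisit = weakly_supported(coded)
```

A theme that appears for only one participant is not automatically wrong, but it is a candidate for removal or for a closer look at why it conflicts with the rest.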

One way of doing this is using affinity diagrams:

  1. record each idea/observation/problem/etc on individual card or post-it notes
  2. look for notes that seem to be related
  3. sort notes into groups until all used
    • sort and resort as necessary

Categorizing Data

  • scheme of data: code the data according to categories
    • if analysis frame is chosen beforehand: deductive analysis
    • if study is exploratory and it is important to let themes emerge from data: inductive analysis
  • can then analyze with appropriate methods like counting, averaging the number of problems, or identifying recurring patterns
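
For the deductive case, where the analysis frame is fixed beforehand, categorizing and counting can be sketched as below. The frame in `CATEGORY_OF` and the coded segments are hypothetical.

```python
from collections import Counter

# Hypothetical analysis frame, chosen before looking at the data (deductive).
CATEGORY_OF = {
    "cannot find menu": "navigation",
    "wrong button": "navigation",
    "unclear label": "terminology",
    "jargon": "terminology",
}


def count_by_category(coded_segments, frame=CATEGORY_OF):
    """Count coded segments per category and collect codes the frame lacks."""
    counts = Counter()
    uncategorized = []
    for code in coded_segments:
        category = frame.get(code)
        if category is None:
            uncategorized.append(code)
        else:
            counts[category] += 1
    return counts, uncategorized


segments = ["wrong button", "jargon", "cannot find menu", "slow response", "wrong button"]
counts, uncategorized = count_by_category(segments)
print(counts.most_common())
print("not covered by the frame:", uncategorized)
```

Codes that fall outside the frame are worth keeping: a growing list of them suggests the predefined categories do not fit the data and an inductive pass may be needed.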

Critical Incident Analysis

Helps identify significant subsets of data for more detailed analysis.

This is not about summarizing all incidents; it is more like finding gold nuggets. Incidents need not always be bad: they can be either desirable or undesirable.

Potential ways conclusions can be flawed:

  1. Construct validity: are we measuring the right thing? Is this clearly connected to our research question? Did we misunderstand the concepts we are working with?
  2. Internal validity: What are alternative explanations for the results? Other bias, confounding factors, etc.
  3. External Validity: To what extent are our results and conclusions of our experiment generalizable to our original research question? (how representative are our tasks and users?)
  4. Empirical Reliability / Reproducibility: Can the study be reproduced?

Risks and Consent:

  1. In what ways could your participants be harmed by the study or its results?
  2. Could be physical harm (less likely in CS), emotional harm (stress, reputation, etc.)
  3. Evaluate the likelihood of each potential risk (including unlikely cases)
  4. Are there ways to mitigate these risks? Potentially: adjust your study design
  5. What would you do if a participant were harmed? e.g. correction, compensation?