Five Key Issues
- Setting Goals. Goals will influence the nature of data gathering sessions, the data gathering techniques to be used, and the analysis to be performed.
- Identifying Participants. Those who fit the profile of types of people from whom data can be gathered are called the study population. Types of sampling are as follows:
  - Probability Sampling: simple random sampling or stratified sampling
  - Nonprobability Sampling: convenience sampling or volunteer panels
    - Convenience Sampling: sample includes those who were available rather than those specifically selected
- Relationship with Participants: informed consent with a clear and professional relationship between participant and researcher (however, informed consent is generally not required when gathering requirements data for a commercial company, where a contract usually exists between the data collector and the data provider)
Triangulation: the investigation of a phenomenon from at least two different perspectives. This is mostly focused on verification and reliability of data rather than making up for the limitations of another type of methodology
- Triangulation of data: data is drawn from different sources at different times/places/people
- Investigator triangulation: different researchers (observers, interviewers, and so on) have been involved in collecting and interpreting the data
- Triangulation of theories: use of different theoretical frameworks through which to view data
- Methodological triangulation: employ different data gathering techniques
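The two probability sampling options listed under Identifying Participants can be sketched in a few lines. This is a minimal illustration, not a prescribed procedure: the population, the `experience` strata, the sample sizes, and the function names are all hypothetical.

```python
import random
from collections import defaultdict


def simple_random_sample(population, k, seed=0):
    """Draw k members uniformly at random, without replacement."""
    rng = random.Random(seed)
    return rng.sample(population, k)


def stratified_sample(population, stratum_of, per_stratum, seed=0):
    """Group the population into strata, then sample randomly within each."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for person in population:
        strata[stratum_of(person)].append(person)
    sample = []
    for name in sorted(strata):
        members = strata[name]
        # Never ask for more people than the stratum contains.
        sample.extend(rng.sample(members, min(per_stratum, len(members))))
    return sample


# Hypothetical study population: (participant id, experience level).
population = [(f"P{i}", "novice" if i % 3 else "expert") for i in range(1, 31)]

random_sample = simple_random_sample(population, 6)
balanced_sample = stratified_sample(population, lambda p: p[1], 3)
```

Simple random sampling may over- or under-represent a small subgroup by chance; stratifying first guarantees that each subgroup appears in the sample.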
A conversation with a purpose
Good for exploring issues, learning more about tasks, and getting inside the user’s head.
- Open-ended/unstructured -> exploratory and similar to conversation. Can be time-consuming but can also produce rich insights
- Semi-structured -> follows a basic script with both closed and open questions, but the interviewer probes the interviewee until no new relevant information emerges
- Structured -> predetermined questions like a questionnaire, study is standardized (same questions with each participant)
- Group interviews -> 3-10 people selected to provide a representative sample of the target population. Useful for investigating shared issues rather than individual experiences
When developing Interview Questions, keep in mind open questions are best suited where the goal of the session is exploratory; closed questions are best suited where the possible answers are known in advance. Break long or compound questions into separate questions
A lot of decisions to make:
- Choosing a framework
- Level of participation to adopt
- How to make a record of the data
- How to gain acceptance into the group being studied
- How to ensure that the study uses different perspectives
Ethnography: the description of the customs of people and cultures. A distinguishing feature of ethnographic studies compared with other data gathering is that a situation is observed without imposing any a priori structure or framework upon it, and everything is viewed as “strange”.
| Technique | Good for | Kind of data | Advantages | Disadvantages |
|---|---|---|---|---|
| Interviews | Exploring issues | Mostly qualitative (some quantitative) | Interviewer can guide the interviewee; encourages contact between researchers and users | Artificial environment may be intimidating; removes users from their usual environment |
| Focus groups | Collecting multiple viewpoints | Mostly qualitative (some quantitative) | Highlights areas of agreement and conflict; encourages contact between researchers and users | Possibility of dominant characters |
| Questionnaires | Answering specific questions | Quantitative and qualitative | Can reach many people with low resource requirements | Design is key; response rates may be low |
| Direct observation in the field | Understanding context of user activity | Mostly qualitative | Observational insights | Very time-consuming; huge amounts of data |
| Direct observation in a controlled environment | Capturing the detail of what individuals do | Quantitative and qualitative | User can focus on task without interruption | Data may be of limited use due to artificial environment |
| Indirect observation | Observing users in their natural environment without distracting them | Quantitative (logging) and qualitative (diary) | Can run over long periods thanks to automatic recording | Large amounts of data imply a need for tools to support analysis; participants may exaggerate memories |
Running the interview
Before starting, make sure that the goals of the interview have been explained to the interviewee and that they are willing to proceed. Listen more than you talk, respond with sympathy but without bias, and appear to enjoy the interview.
- Introduction
  - interviewer introduces themselves
  - explain why you’re doing the interview
  - reassure interviewee re: ethical issues
  - ask interviewee if they mind being recorded
- Warm-up session
  - easy, nonthreatening questions
- Main session
  - questions presented in logical sequence
  - probing questions at the end
  - order may vary in semi-structured interview
- Cooling-off period
  - easy questions to defuse any tension
- Closing session
  - interviewer thanks interviewee
  - switch off recorder or put notebook away
Users may be observed directly by the investigator as they perform their activities or indirectly through records of the activity that are studied afterward.
Observation can also result in a lot of data to sift through and can be more complicated to do well than at first appreciated. As such, a clearly stated goal is important to give an observation session focus.
- The person: Who is using the technology at any particular time?
- The place: Where are they using it?
- The thing: What are they doing with it?
3 common approaches
- Simple observation: user is given a task, the evaluator just watches. This gives no insight into users’ decision process
- Think aloud: subjects are asked to say what they are thinking/doing. However, it’s hard to talk while concentrating, and thinking aloud may alter the way people naturally perform the task.
- Co-discovery learning: two people work together on a task and normal conversation is monitored.
Degree of Participation
- Passive Observer: observer who adopts an approach at the outsider end; does not take part in the study environment at all
- Participant Observer: adopts an approach at the insider end; becomes a member of the group being studied
A data recording instrument in which a list of itemized coding options is structured.
This standardizes observation practice, which makes the observations more objective.
- evaluation goals (break it all down!)
- stage of design
- observation method/types of data
- what would potentially be an interesting finding from this particular style?
- e.g. for think-aloud, it might be good to record action vs spoken comments
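A recording instrument of this kind can be modeled as a small data structure. The sketch below assumes a think-aloud session in which actions and spoken comments are logged separately; the code names in `CODES`, the `ObservationSheet` class, and the sample events are all made up for illustration.

```python
from collections import Counter
from dataclasses import dataclass, field

# Hypothetical coding options; a real scheme would follow the evaluation goals.
CODES = {"ACTION", "COMMENT", "ERROR", "HELP_REQUEST"}


@dataclass
class ObservationSheet:
    participant: str
    events: list = field(default_factory=list)

    def record(self, seconds, code, note=""):
        """Log one event; reject codes that are not part of the scheme."""
        if code not in CODES:
            raise ValueError(f"unknown code: {code}")
        self.events.append((seconds, code, note))

    def tally(self):
        """Count how often each code was used in this session."""
        return Counter(code for _, code, _ in self.events)


sheet = ObservationSheet("P1")
sheet.record(12, "ACTION", "opens settings menu")
sheet.record(15, "COMMENT", "'I expected this under Profile'")
sheet.record(31, "ACTION", "returns to home screen")
counts = sheet.tally()
```

Rejecting unknown codes is what keeps the practice standardized: every observer has to pick from the same fixed list.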
Survey vs Questionnaire: the questionnaire is a part of the survey. The questionnaire is just the concrete things you’re asking.
- does not require presence of evaluator
- many results can be quantified
- preparation is “expensive” → need to design questions well
- can have low response rate or low quality response
- difficult to do in-depth “probing”
A questionnaire is good when motivation is high enough without anyone else present. If persuasion is needed, a structured interview is probably better
Designing a Questionnaire
Keep in mind
- what info is sought?
- how would you analyze results?
- what audience do you want to reach?
- what will you do with your analysis?
Avoid vague questions, and pilot the questionnaire before using it in the study.
- questions should not be transferable to other interfaces and should not be open to different interpretations depending on judgment (i.e. domain-specific and clearly worded)
- avoid double-barreled questions and leading questions
- to de-bias: neutral language, can have random order of questions for different participants
- for validity
- use previously validated questionnaires
- triangulation (ask multiple questions about the same matter)
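Randomizing question order per participant, as suggested above for de-biasing, could look like the following sketch. The function name, the `study_seed` value, and the sample questions are assumptions; seeding by participant ID is one possible design so that each participant's order can be reproduced later.

```python
import random


def question_order(questions, participant_id, study_seed="pilot-1"):
    """Return a shuffled copy of the questions, reproducible per participant."""
    rng = random.Random(f"{study_seed}:{participant_id}")
    order = list(questions)  # copy, so the master list keeps its order
    rng.shuffle(order)
    return order


# Hypothetical questionnaire items.
questions = [
    "How easy was it to find the search function?",
    "How confident did you feel completing the task?",
    "How likely are you to use this feature again?",
    "How clear were the error messages?",
]

order_p1 = question_order(questions, "P1")
order_p1_again = question_order(questions, "P1")
```

Every participant still answers the same set of questions; only the sequence differs, which spreads any order effect across the sample.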
Types of questions
- Open-ended (hard to analyze rigorously)
- Closed (easily analyzed but can be hard to interpret if not well-designed)
- checkboxes and ranges
- range of answers to demographic questions is predictable: offer a predefined list
- interval doesn’t have to be equal in all cases, depends on what you want to know
- mention how many boxes to check, be consistent with ascending/descending order
- Likert and semantic differential scale
- used for measuring opinions, attitudes, and beliefs
- widely used for evaluating user satisfaction
- Likert: a five, seven, or nine-point agreement scale used to measure respondents’ agreement with various statements
- semantic differential scale: relies on pairs of opposing adjectives to explore a range of bipolar attitudes about a particular item
- ranked (closed)
- respondent places ordering on items in a list
- useful to indicate preferences
- forced choice
- multi-choice (closed)
- offered choice of explicit responses
- checkboxes and ranges
- think about ordering of questions → the impact of a question can be influenced by question order
- consider if different versions are needed for different populations
- provide clear instructions on how to complete questionnaire
- e.g. if answers can be saved and completed later
- think about length → avoid questions that don’t address study goals
- consider allowing respondents to opt out at different stages especially if long → better to have some than none
- think about layout and pacing
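Responses to a Likert item are often summarized by their median and by how many respondents chose each point. The sketch below assumes a five-point agreement scale; the labels, the function name, and the responses are invented for illustration.

```python
from collections import Counter
from statistics import median

# Assumed five-point agreement scale.
LABELS = {
    1: "Strongly disagree",
    2: "Disagree",
    3: "Neutral",
    4: "Agree",
    5: "Strongly agree",
}


def summarize_likert(responses):
    """Summarize one Likert item: sample size, median, and per-point counts."""
    counts = Counter(responses)
    return {
        "n": len(responses),
        "median": median(responses),
        "distribution": {LABELS[point]: counts.get(point, 0) for point in LABELS},
    }


# Made-up responses to a statement such as "The app was easy to use".
responses = [4, 5, 3, 4, 2, 4, 5]
summary = summarize_likert(responses)
```

Reporting the full distribution next to the median shows whether respondents cluster around one point or split between the two ends of the scale.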
Data and Analysis
- Subjective: what you were told happened
- Objective: what you captured using your senses
- Quantitative: data in the form of numbers or data that can be easily translated into numbers. Focuses on ascertaining the magnitude, amount, or size of something
- Qualitative: data in the form of words and images. Focuses on nature of something (themes, patterns, and stories)
Note that quantitative data is not always objective! Subjectivity can come from participants in how they express opinions or from investigators during the data capturing/interpreting/analysis process.
Similarly, it is unfair to try to quantify all qualitative data; doing so needs justification. Also be wary of translating small population sizes into percentages.
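One way to guard against misleading percentages is to always report the raw count and to add a percentage only once the sample is large enough. The helper below is a sketch; the function name and the threshold of 20 participants are arbitrary assumptions, not an established rule.

```python
def report_share(count, total, min_n_for_percent=20):
    """Describe a proportion, adding a percentage only for larger samples."""
    if total <= 0:
        raise ValueError("total must be positive")
    if total < min_n_for_percent:
        # With few participants, "75%" suggests more precision than exists.
        return f"{count} of {total} participants"
    return f"{count} of {total} participants ({100 * count / total:.0f}%)"


small_study = report_share(3, 4)
large_study = report_share(30, 40)
```

Both calls describe the same proportion, but only the larger study is reported with a percentage.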
What to focus on
- What are the most important needs/tasks to support?
- What are the repeated patterns?
- Key issues/areas that could be improved
- What surprised you?
- What is essential/nonessential in implementation
- Initial reactions or observations (identify patterns, simple numerical analysis like averages, ratios, percentages)
- Data cleansing (checking for erroneous entries and anomalies)
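Data cleansing and a first numerical pass can be combined in a short script. The sketch below assumes task completion times in seconds with a plausible range of 0 to 600; the bounds, the function name, and the raw entries are hypothetical.

```python
from statistics import mean


def clean_times(raw_entries, low=0.0, high=600.0):
    """Split raw entries into usable values and rejected entries."""
    kept, rejected = [], []
    for entry in raw_entries:
        try:
            value = float(entry)
        except (TypeError, ValueError):
            rejected.append(entry)  # blank or non-numeric entry
            continue
        if low < value <= high:
            kept.append(value)
        else:
            rejected.append(entry)  # outside the plausible range
    return kept, rejected


# Made-up task times as typed into a spreadsheet.
raw = ["42.5", "38", "", "-3", "4100", "51.5"]
kept, rejected = clean_times(raw)
average = mean(kept)
```

Keeping the rejected entries rather than silently dropping them makes it possible to check later whether an anomaly was a typo or a genuine observation.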
Themes are a small number of high-level patterns that answer your evaluation questions.
Inductive analysis goes from codes (descriptive labels) to categories (groupings imposed on codes) to themes (interpretive patterns). Deductive analysis is just the inverse (starting at themes and arriving at codes).
Do an initial pass to check for internal consistency: make sure themes occur across several or all participants. Then, step back to see if an overarching narrative emerges from the themes. One can then remove themes or look into why there are conflicts.
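The internal consistency check can be approximated by counting how many distinct participants support each theme. This is a sketch under assumptions: the `(participant, theme)` note format, the function name, and the two-participant threshold are all hypothetical.

```python
def theme_coverage(coded_notes, min_participants=2):
    """Map each theme to the participants who mentioned it."""
    seen = {}
    for participant, theme in coded_notes:
        seen.setdefault(theme, set()).add(participant)
    return {
        theme: {
            "participants": sorted(who),
            "consistent": len(who) >= min_participants,
        }
        for theme, who in seen.items()
    }


# Made-up coded notes: (participant id, theme).
coded_notes = [
    ("P1", "navigation"),
    ("P2", "navigation"),
    ("P3", "navigation"),
    ("P2", "trust"),
    ("P1", "navigation"),  # repeat mentions by one person count once
]
coverage = theme_coverage(coded_notes)
```

A theme flagged as not consistent is not necessarily wrong; it is a prompt to revisit the data before keeping or dropping it.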
One way of doing this is using affinity diagrams:
- record each idea/observation/problem/etc on individual card or post-it notes
- look for notes that seem to be related
- sort notes into groups until all used
- sort and resort as necessary
- scheme of data: code the data according to categories
- if analysis frame is chosen beforehand: deductive analysis
- if study is exploratory and it is important to let themes emerge from data: inductive analysis
- can then analyze with appropriate methods like computing the average number of problems or identifying recurring patterns
Critical Incident Analysis
Helps identify significant subsets of data for more detailed analysis.
This is not about summarizing all incidents; it is more like finding gold nuggets. Incidents need not always be bad: they can be either desirable or undesirable.
Risks and Consent
Potential ways conclusions can be flawed:
- Construct validity: are we measuring the right thing? Is this clearly connected to our research question? Did we misunderstand the concepts we are working with?
- Internal validity: What are alternative explanations for the results? Other bias, confounding factors, etc.
- External Validity: To what extent are our results and conclusions of our experiment generalizable to our original research question? (how representative are our tasks and users?)
- Empirical Reliability / Reproducibility: Can the study be reproduced?
Risks and Consent:
- In what ways could your participants be harmed by the study or its results?
- Could be physical harm (less likely in CS), emotional harm (stress, reputation, etc.)
- Evaluate the likelihood of each potential risk (including unlikely cases)
- Are there ways to mitigate these risks? Potentially: adjust your study design
- What would you do if a participant were harmed? e.g. correction, compensation?