In order to develop robot systems that can interact with untrained, naïve human users, it is important to understand how people from different age groups perceive a given robot system and which features might be relevant for this perception. While existing studies generally use questionnaires, interviews and/or coding schemes that yield either abstract categories or single features of individual behaviour towards a robot, we propose a different methodological approach: using the concepts and methodological tools of Conversation Analysis (EM/CA). Investigating video data from a study in which users (here: infants 3 to 8 years old) play with the toy robot Pleo, we show that, and how, (1) a user’s perception, categorization and re-interpretation of a robot system emerge step by step during and from the interaction with the system, and (2) the users’ attempts to establish coordinated ‘sequences of action’ play a central role in this. The results of our exploratory case analysis are discussed in the light of studies suggesting that, in infants, robotic pets seem to blur foundational ontological categories, such as animate vs. inanimate.