The Emergence of the Minimal Self (Research paper)

Action learning or homeostasis at the core of the emergence of the minimal self?


There are two competing frameworks which explain the emergence of the minimal self in humans. One is based on the “interoceptive inference” framework which places homeostatic regulation at the core of the minimal self and could be described as “the self by being”. The other is based on ideomotor and event coding theories and takes the approach of the “self by doing”. Knowing which framework gives the most accurate description of the emergence of the minimal self is important if we want to embed a sense of self in artificial agents. To answer this question and pit the two frameworks against each other, we propose an experiment which will test whether or not the homeostatic system of the pupil can learn new goals. The “self by doing” framework predicts that a homeostatic system can learn new goals while the “self by being” assumes that it can have only one goal, the goal of survival. Our research hypothesis is that a homeostatic system can learn new goals, which would mean that the “self by doing” framework is correct and offers a more accurate description of the emergence of the minimal self than the “self by being” framework. We also propose another experiment with the same homeostatic system of the pupil but in dangerous conditions using virtual reality and we speculate that the ability to learn new goals is curtailed when survival is threatened. This would indicate that the minimal self is made of two sub-minimal selves, a “minimal self by doing” in normal conditions and a “minimal self by being” in threatening situations.

Words count Abstract: 263

Keywords: ideomotor theory, theory of event coding, homeostasis, interoception, exteroception, proprioception, sense of agency, sense of body ownership, artificial agents.


This study intends to test whether a homeostatic system has only one goal, the goal of survival, or whether new goals can be learned by the system. Answering this research question will help differentiate between two competing frameworks. The first framework is the “Interoceptive Inference” theory of Seth and Tsakiris (2018). The second framework is based on the ideomotor, or “self by doing”, approach of Verschoor and Hommel (2017) and the theory of event coding (TEC) of Hommel (2015). These two frameworks compete in explaining the emergence of the minimal self, and give different answers to the question of how many goals a homeostatic system can have.

The concept of Self: historical overview and recent development on minimal self

One of the earliest formulations of the self in modern psychology was made by James (1891); he differentiated between the self as I, the subjective knower, and the self as Me, the object that is known. Since then philosophers and psychologists have expanded this concept with delineations between cognitive, embodied, fictional, social and narrative selves among others (Strawson, 1999). Recent developments, however, classify the various selves into two main groups, the minimal self and the narrative self (Gallagher, 2000). This study is concerned with the minimal self only, which is defined as “a basic, immediate experience of being conscious of oneself as an immediate subject of experience and unextended in time”. It is made of two elements: the sense of agency and the sense of body ownership. The sense of agency is the agent’s subjective awareness of initiating, executing and controlling its own volitional actions in the world, while the sense of body ownership describes the feeling of “mineness” toward one’s body parts, feelings or thoughts (Gallagher, 2000).

The brain constantly processes information signals about the environment, the body and their interactions and from this processing a sense of minimal self emerges. These signals take three forms: the exteroceptive signals from the exteroceptive system through the sensory cortices, the interoceptive signals from the autonomic system through the anterior insular cortex and the proprioceptive signals from the motor system through the motor cortex (Sterling, 2012).

The framework of Seth and Tsakiris (2018) is mainly concerned with the interoceptive signals while Verschoor and Hommel (2017) and Hommel (2015) focus on the exteroceptive/proprioceptive signals. Although these two frameworks differ on the nature of the signal primarily used by the brain to implement a self, they both describe the emergence of the minimal self as the ability of the brain to make predictions about the environment, the body and their interactions and to check these predictions against actual outcomes.

Understanding how the minimal self emerges in humans has implications on how to embed this capability into artificial agents.

Understanding how this minimal self emerges is gaining importance not only in cognitive psychology but also in Artificial Intelligence research and robotics, as it becomes important to know whether artificial agents could be designed with a sense of self. This raises important philosophical, moral, ethical and legal questions with far-reaching implications. Such questions concern the morality of these artificial agents, their legal rights and whether they can be held legally accountable and responsible as human beings are. There are also opportunities, as these artificial agents could help solve intractable problems that have eluded human intelligence until now (Clausen, 2015), and endowing them with a minimal sense of self would make them better suited to live and interact with human beings and more relevant to our day-to-day life (Hafner, Loviken, Pico, Villalpando & Schillaci, 2020). In the near future, our societies will have to decide how to address these questions. Understanding how the sense of self emerges in humans will enable our societies either to pursue or to prevent its emergence in artificial agents. Differentiating between the two frameworks matters because each leads to a different approach to embedding the minimal self in artificial agents.

Framework of ideomotor and event coding theories

The sense of agency, defined as the sense of being the one who generates an action, is not always necessarily linked to movement. It is primarily through movements, though, that the body “acts” upon the environment. This relationship has been researched by Gallese and Ferri (2014), who have shown that the ability of humans to control their movements leads to a “self” versus “other” distinction. Similarly, the “central monitoring theory” of Blakemore, Wolpert and Frith (2002) has shown that we recognise ourselves as agents having a sense of agency on the basis of the congruence between self-generated movements and their expected consequences. This approach sees the role of action as a key determinant for self-identification. Its explanation is that every time the motor centers send an outgoing signal for producing movement, a copy of this signal is “retained” and then compared with the returning input signals resulting from the movement. In the case of self-generated movement there is a perfect match between the prediction and the actual movement itself. When there is no match between the expectation and the actual movement, the brain categorises this movement as an “external event”. Evidence for this theory has been provided, for example, by patients suffering from lesions in the motor cortex which resulted in abnormalities in their awareness of action and the sense of self (Berti, Bottini, Gandola, Smania & Stracciari, 2005). This theory can be seen as an evaluation of the match between a predicted action and its actual outcome.

But for this theory to fully account for the sense of agency, it must also explain how the agent develops the ability to intentionally select actions. This is described as the action “identification” phase. Here, the ideomotor theory provides an explanation for this phenomenon: the agent develops over time the ability to select an “action” through its active interaction with the social and physical environment (Verschoor & Hommel, 2017). The ideomotor theory offers an explanation as to how this knowledge is gained, that is, how prior knowledge of action-effect relationships is created. The theory suggests that knowledge about the action-effect relationships is gained over time through unintentional movements produced during development. As the child gains more knowledge he/she can choose the right action for the right situation. With the progressive mastery of motor skills, the child learns to direct his/her movements and interact with the environment through thought. So the development of the sense of agency is explained by the following sequence of events: the acquisition of knowledge about bidirectional action-effect associations, the intentional identification of the proper action, the ability to carry out these actions through the control of motor movement and finally the ability to “evaluate” the match between the predicted and actual effect of an action (Verschoor & Hommel, 2017).

The ability to develop a library of action-effect associations is further explained by TEC (Hommel, 2015), which expands the ideomotor principle of general knowledge acquisition by showing how events are coded together bidirectionally. The “pairing” associations between features of events, which can be actions or perceptions, are further strengthened by repeated occurrence. This theory claims that perceptions and actions are directly linked by a common computational code and that the triggering of one of the features initiates the activation of the others. It also suggests how humans learn to differentiate themselves from other humans. In principle, TEC does not distinguish between how one represents oneself and how one represents someone else when an action-effect pairing is presented. When, for example, an infant sees her mother move the mobile above the crib, she might not know that it is her mother moving the mobile. Over time, however, the infant learns that it is her mother and not herself interacting with the mobile, as her own actions are easier to predict than those of her mother. It is also the case that if the infant were the one moving the mobile she would receive “enhanced” signals from her interoceptive channels (Hommel, 2005).

In that view, the minimal self is constructed over time as the sense of agency comes with the ability and mastery to perform goal-directed actions. Actions are matched, or not, to the expectations the infant has. This ability does not appear before the age of nine months according to Verschoor and Hommel (2017), so it is a learned ability.

Nevertheless, this sense of agency is one of the elements which constitute the minimal self (Gallagher, 2000; De Haan & De Bruin, 2010) and even if ultimately the sense of self might require brain processes and a biological body (Ferré, Lopez & Haggard, 2014; Damasio, 1999; Searle, 1997), ideomotor and TEC theories are compatible with the view that artificial agents could develop a minimal sense of self. What this would require is for these systems to have the ability to build bidirectional action-effect associations and acquire knowledge about their environment and “themselves” through the principles of event coding theory (Herbort & Butz, 2012). From these associations, as for human beings, a sense of goal-directedness could emerge as envisaged by Aleksander (2007) when he considered how a machine could develop a “sense of self in a perceptual world” and a sense of “self” versus “other” distinction without being tied to a biological body. Doing so requires giving artificial agents the ability to develop bidirectional associations, select their actions intentionally and assess the outcome of these actions.

The framework of “interoceptive inference” theory

In the last ten years, psychologists and neuroscientists have studied the interactions between the brain and the body, and more specifically interoception, “the sense of self from within” either in healthy individuals or in patients suffering from illness. For example, LeDoux (1998) challenged the prevailing view that the brain was an information-processing machine that could be described and understood independently from the rest of the body, as if cognition could be somehow disembodied.

Pioneers in this field like Damasio (1994) explored the field of affective neuroscience and the importance of emotions in cognition. Through the study of neurological and psychiatric disorders in which the embodied sense of self is disrupted, such as schizophrenia, autism, attention disorders and emotional-processing disorders, interoception took center stage in the empirical study of the minimal self (Damasio, 2003).

Interoception is defined as the perception and integration of all signals from within our body, whether we attend to them or not (Sherrington, 1906). Originally, it referred to the visceroceptive information from the autonomic system. It has since been extensively studied and an expanded version of this concept has emerged in which these perceptions include autonomic, hormonal, visceral and immunological functions: breathing, blood pressure, cardiac signals, temperature, digestion and elimination, thirst and hunger, sexual arousal, affective touch, itches, pleasure and pain (Cameron, 2002; Khalsa & Lapidus, 2016). The various signals we perceive account for the fact that we experience our “self” as being “inside” a body, mainly through interoception, and as a body which moves and interacts with the environment through proprioception and exteroception (Hommel, 2005). In Seth and Tsakiris (2018), interoception is defined as visceral signals, low-level monitoring of blood chemistry, and affective touch or pain. They also add “interosensations” and “interoactions” about the body states, which makes it difficult to assess exactly whether interoception includes proprioception.

Interoception nevertheless lies at the core of our physiological body and is essential for cognition. This concept of embodied cognition is not new and is also present in TEC (Hommel, 2015) and ideomotor theory (Verschoor & Hommel, 2017). In these theories, human cognition has deep roots in sensorimotor processing and is not seen as a centralised, abstract entity sharply distinct from peripheral input and output modules. In the ideomotor theory, however, the proprioceptive and exteroceptive signals receive more emphasis than the interoceptive signals in the development of the action-learning capability. In that sense, this framework is compatible with the concept of embodied cognition: bodily signals are required for cognition to take place. For the “interoceptive inference” framework, however, the interoceptive signal is the principal signal at the core of embodied cognition.

These interoceptive signals are processed in the same brain regions as the ones processing the homeostatic regulations as postulated in Damasio’s somatic marker hypothesis (Damasio, Everitt & Bishop, 1996). These same brain areas are also involved in decision making processes and when these areas are damaged in patients, they can no longer process their “interoceptive” signals and their ability to make decisions is impaired. So homeostatic regulation, interoceptive signals and decision making happen to be processed by the same brain areas, linking them in some ways. The ability to make decisions contributes to the sense of agency in an agent and the linkage between interoceptive signals and decision making ability would indicate that interoceptive signals play a role directly or indirectly in the sense of agency.

In this “interoceptive” framework, Grivaz, Blanke and Serino (2017) identify the insula in particular as involved in the formation and maintenance of the minimal self. They describe the phenomenon as the body sending signals to the brain and vice versa, in a constant feedback loop that involves the Autonomic Nervous System (ANS) acting in response to external inputs and interoceptive states, enabling and disabling the body’s various states of arousal and fight or flight responses. In this way, the ANS ensures constant adjustments in order to maintain the overarching goal of survival. Furthermore, as shown in Seth and Tsakiris (2018), the more people are connected with their interoceptive regulation, that is, their ability to monitor their own internal states, the less prone they are to be “fooled” by the self-illusion induced by the Rubber Hand Experiment, which leads these two researchers to claim that “interoceptive processing acts to stabilise the model of our self”. This is how the emergence of the self is explained: by feeling our embodied selves from within, we develop the ability to form a boundary between self and the “others” as well as the environment. The interoceptive signal becomes the “reference” signal, which sustains a constant sense of selfhood in dynamic relation to and distinction from others (Fotopoulou & Tsakiris, 2017). In this view, the brain is seen as a “statistical organ” that makes predictions about sensory information on the basis of previous instances. So, an infant will build models over time to explain the possible causes of its sensory state in the external world, and the brain will predict the probability of an interoceptive signal as a result of an input from the environment. The actions that the child will undertake in response to the interoceptive signals serve to reduce the prediction errors with regard to the expectation of the environmental inputs.
This physiological homeostatic reaction turns into the “psychological feelings” they call “mentalisation” and this is what forms the core of the infant’s minimal self. The stability of the “perceived” self is ensured by the constant engagement by the brain of the allostatic prediction to ensure our survival in a complex and ever changing dynamic environment. Allostasis is defined in Seth and Tsakiris (2018) as a “form of regulation that emphasises the process of achieving stability through change”.

So, in the context of equipping artificial agents with a minimal sense of self, an equivalent to an interoceptive signal would be required. This is exactly what is suggested by Man and Damasio (2019): embedding into artificial agents a process resembling homeostasis and equipping them with a robotic body that must be maintained within a narrow range of viability states. By building “vulnerability” into the robotic body of the artificial agent, it would need to develop goals to ensure its survival, which in turn would form the basis for a sense of agency. This robotic body would be made of soft “tissues” embedded with electronics, sensors and actuators as well as a computational system which allows cross-modal associations. With these two elements, the machine would represent its “internal” state, and this would be an equivalent of the interoceptive system in humans. With an interoceptive signal in place, the minimal self could emerge following the same principles as described above in the “interoceptive inference” framework.

Research proposal to pit the two frameworks against each other

This study intends to test the key assumption made by Seth and Tsakiris (2018) that a homeostatic system can only have one goal. This is in contradiction with the ideomotor and TEC framework, which predicts that a homeostatic system can learn new goals. This prediction comes from the action-effect bidirectional association principle. In the proposed experiment, the action of dilating or constricting the pupil is associated with tones as effects. Following the acquisition of this association, the study tests whether the tone (the effect) in return evokes the action.

So the research question is whether a homeostatic system can learn new goals, and the research hypothesis is that homeostatic systems can learn new goals above and beyond the goal of survival.

To test this hypothesis, we choose the homeostatic system of the pupil as it is well researched and understood (La Morgia, Carelli, & Carbonelli, 2018). The study consists of three phases: a calibration phase, an acquisition phase and a testing phase. The calibration phase will establish each individual’s average pupil size when the pupils dilate, when they constrict and at rest. During the acquisition phase we will induce the action (pupil dilation or constriction) by displaying a dark or bright screen and afterwards present the action’s effect (low or high pitch sound). In the testing phase, we will test whether an action-effect association was formed during the acquisition phase by first presenting the effect (low or high pitch sound) and then observing whether the effect triggers the action (pupil dilation or constriction) while the screen has neutral brightness. The dark and bright screens of the acquisition trials are used to induce the action (pupil dilation/constriction), with the neutral screen serving as the resting state. Our hypothesis is that we will observe that effects (low or high pitch sound) induce the action (pupil dilation or constriction) they are associated with.

Figure 1. Experimental procedure

If, as this study intends to show, a homeostatic system can have more than one goal, this would call into question the validity of the “interoceptive inference” framework in explaining the emergence of the minimal self and reinforce the competing explanation offered by the ideomotor/TEC framework.

Method section

Subjects

The experiment will require 55 participants and for the justification of this sample size, refer to the Power Analysis section. They will be recruited through an advertisement campaign with the title “Eye tracking: test your visual attention!”, attracting undergraduate students from Leiden University. Participants should be between 18 and 35 years of age and free of current or previous psychiatric or neurological diagnosis. Additionally, they should have normal or corrected-to-normal vision and hearing. The experiment will last around 60 minutes and participants will receive financial compensation (6,50 €) or credits (2 EC) according to the normal hourly rate for participating in a study.

Test Environment & Apparatus

The participants will be seated in a room and place their heads on a chin rest facing a computer monitor. A 24-inch TFT-screen (HP Elite Display E242, 1920 x 1200 pixels, 16:10), equipped with a Tobii X3 eye-tracker, will be used for visual stimulus presentation. Auditory stimuli will be presented through headphones. The Tobii external processing unit records gaze data at 120 Hz and pupillary data at 40 Hz. The Tobii X3 has an average accuracy of 0.4° and allows for a certain amount of head movement by the subjects (50 x 40 x 40 cm). Stimulus presentation will be controlled by a PC running E-Prime® software. The distance between eyes and apparatus will be approximately 65 centimeters (the screen’s viewing angle will be 43.5° by 28.0°). Further, in order to guarantee consistent pupil measurement, lighting in the room will be constant in all conditions and for all participants.
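The reported viewing angle can be checked from the screen geometry. The following is a minimal sketch, assuming a flat screen viewed head-on from its centre and taking the physical width and height from the 24-inch diagonal and the 16:10 aspect ratio; the function names are illustrative and not part of the study's software.

```python
import math

def screen_size_cm(diagonal_inch, aspect_w, aspect_h):
    """Physical width and height (cm) of a screen from its diagonal and aspect ratio."""
    diagonal_cm = diagonal_inch * 2.54
    norm = math.hypot(aspect_w, aspect_h)
    return diagonal_cm * aspect_w / norm, diagonal_cm * aspect_h / norm

def visual_angle_deg(extent_cm, distance_cm):
    """Full visual angle (degrees) subtended by an extent centred on the line of sight."""
    return math.degrees(2 * math.atan((extent_cm / 2) / distance_cm))

# Values taken from the apparatus description: 24-inch, 16:10 screen at 65 cm.
width_cm, height_cm = screen_size_cm(24, 16, 10)
h_angle = visual_angle_deg(width_cm, 65)
v_angle = visual_angle_deg(height_cm, 65)
print(f"{h_angle:.1f} deg by {v_angle:.1f} deg")
```

Under these assumptions the result comes out close to the stated 43.5° by 28.0°; small differences are expected because the viewing distance is only approximate.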


Upon arrival, the participant will be provided with the information letter and the informed consent form. After signing the informed consent form, the participant will be guided to the room with the eye tracker. Here, the participant will be seated in front of a TFT-screen, with about 65 centimeters distance between the participant’s eyes and the screen. The participant will be told that the eye tracker measures where the participant fixates. Additionally, the participant will be given instructions by the experimenter before the start of the experiment. Participants will be asked to keep their focus on the fixation cross in the center of the screen and move as little as possible. They will be told that the brightness setting of the screen will change. They will be notified about the different tones that will be presented and that they can ignore them. Additionally, they will be told that occasionally, the color of the fixation cross will change. Their task will be to respond to those changes by pressing the space bar (see Dummy Task section). After giving the instructions, the experimenter will check whether the participant has understood them correctly. Additional instructions about the task itself will be presented on the monitor. After the participant has read these thoroughly, the experimenter will again check whether the participant has understood them correctly. Finally, the participant will be equipped with headphones.

As previously mentioned, the experiment will have three different phases: the calibration phase, the acquisition phase and the test phase. All participants will go through these phases in the same order. In addition, the participants will focus on a fixation cross in the middle of the screen while the eye-tracker measures pupil size.

Dummy Task

During the calibration, acquisition and testing phases the participants will perform a dummy task in which they are instructed to press the spacebar as quickly as possible if the fixation cross changes to a different (iso-luminant) color. This will occur in 20% of the trials. This dummy task ensures a stable degree of attention on the task. If the participant does not focus properly on the fixation cross, a pop-up message appears on the monitor, requesting the participant to focus on the fixation cross. Thus, the dummy task requires gaze-contingent eye tracking.

Calibration Phase

The calibration phase (see Figure 2) consists of 10 fixation trials for each luminance condition (dark, bright and neutral), from which we will calculate the average pupil size (in dilation, constriction and at resting-state) for each condition. For this phase the participants will focus on the fixation cross (0.4º) in the center of the screen. This phase begins with the screen in a neutral brightness setting, during which the pupil size will be measured for 500 ms after the initial pupil reflex to the brightness setting. This trial is followed by a dark or light screen with the same pupil measurement procedure. Dark and light screen brightness settings will be presented randomly with a neutral screen brightness setting in between these conditions. This phase will give us the change in pupil size (dilation or constriction) that must be reached during the acquisition phase before a tone is presented. To obtain this value for dilation, we will compute the mean pupil size of the neutral screen condition (X) and subtract it from the mean pupil size of the dark screen condition (Ydark). We will then multiply this difference by .70 and add X to the result. This is the threshold of pupil dilation (Udark) that triggers the tone associated with pupil dilation. In other terms: Udark = X + .70 * (Ydark – X), so that the threshold lies 70% of the way from the resting pupil size to the mean dilated pupil size. The same applies to the bright screen condition: Ubright = X + .70 * (Ybright – X). In this case, Ubright is the threshold of pupil constriction that triggers the tone associated with pupil constriction.
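As an illustration of the threshold computation, here is a minimal sketch. It assumes that the threshold is meant to lie 70% of the way from the mean resting pupil size (X) towards the mean pupil size in the dark or bright condition (Y), so that the dilation threshold is larger than X and the constriction threshold smaller. The pupil sizes below are invented for illustration only.

```python
from statistics import mean

def pupil_threshold(neutral_sizes_mm, condition_sizes_mm, fraction=0.70):
    """Threshold U lying `fraction` of the way from the neutral mean X to the condition mean Y."""
    x = mean(neutral_sizes_mm)
    y = mean(condition_sizes_mm)
    return x + fraction * (y - x)

# Hypothetical calibration data (mm); real values come from the calibration trials.
neutral = [4.0, 4.1, 3.9, 4.0]
dark = [5.4, 5.6, 5.5, 5.5]      # dilated pupil
bright = [2.5, 2.4, 2.6, 2.5]    # constricted pupil

u_dark = pupil_threshold(neutral, dark)      # dilation threshold, above the neutral mean
u_bright = pupil_threshold(neutral, bright)  # constriction threshold, below the neutral mean
print(round(u_dark, 2), round(u_bright, 2))
```

Because the same expression is used for both conditions, the sign of (Y - X) alone determines whether the threshold sits above or below the resting size.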

Figure 2. Calibration Phase. The depicted figure displays the calibration phase. Participants will be exposed to dark (T2), bright (T4) and neutral (T5) screen settings in a random order, during which their pupil size will be measured. We expect pupil dilation (T2) when there is exposure to a dark setting and pupil constriction (T4) when bright screen settings are presented, as displayed in the figure. In between trials a neutral screen will be presented. From the values obtained during this phase, we will establish a baseline value for the contracted, dilated and resting-state pupil size.

Acquisition Phase

The acquisition phase (see Figure 3) consists of a total of 100 trials, in which an association between action (pupil dilation/constriction) and effect (high/low pitch tone) is established. Each trial starts in a neutral brightness setting. Once the eye tracker registers adequate focus on the fixation cross, the brightness of the monitor will change to either dark or light, causing pupil dilation or constriction (action). Once the eye tracker registers the value obtained for U (see Calibration Phase section), which was previously calculated in the calibration phase, a tone will be presented for 200 ms to the participant. We will present both high pitch (800 Hz) and low pitch (300 Hz) tones. Both trial types (dark/bright luminance) will be presented 50 times in a randomized order throughout this phase. The brightness setting will turn back to neutral 5 seconds after a tone is presented to the participant.
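The trial logic of the acquisition phase can be sketched as follows. This is an illustrative outline only, not the E-Prime implementation: the function names are hypothetical, and the assignment of the low tone to dark trials and the high tone to bright trials is an assumption, since the text does not state which tone is paired with which action.

```python
import random

# ASSUMPTION: the tone-to-condition mapping is not specified in the text; this one is hypothetical.
TONE_HZ = {"dark": 300, "bright": 800}

def make_acquisition_trials(n_per_condition=50, seed=None):
    """Randomised list of trial types with an equal number of dark and bright trials."""
    rng = random.Random(seed)
    trials = ["dark"] * n_per_condition + ["bright"] * n_per_condition
    rng.shuffle(trials)
    return trials

def first_threshold_crossing(samples_mm, condition, threshold_mm):
    """Index of the first pupil sample that reaches the threshold U, or None if never reached.

    Dark trials wait for dilation (size >= U); bright trials wait for constriction (size <= U).
    Missing samples (None) are skipped.
    """
    for i, size in enumerate(samples_mm):
        if size is None:
            continue
        if condition == "dark" and size >= threshold_mm:
            return i
        if condition == "bright" and size <= threshold_mm:
            return i
    return None

trials = make_acquisition_trials(seed=1)
# Hypothetical pupil trace on a dark trial with a dilation threshold of 5.05 mm.
tone_onset_index = first_threshold_crossing([4.0, 4.5, None, 5.1, 5.3], "dark", 5.05)
print(len(trials), tone_onset_index, TONE_HZ["dark"])
```

At 40 Hz, each sample index corresponds to 25 ms, so the index returned here would translate into the delay between the brightness change and the tone onset.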

Figure 3. Acquisition Phase. This figure shows a possible series of trials of the acquisition phase. A trial begins with a neutral brightness setting (T1). Once adequate focus is registered, the brightness setting will change to either dark (T2.1) or bright (T2). When the Ubright/dark value (obtained in the calibration phase) is reached, the corresponding pitch tone (high or low) will be heard. Five seconds later, the screen will return to the neutral brightness setting (T3).

Testing phase

The testing phase (see Figure 4) contains a total of 50 trials. During this phase the monitor stays in the neutral brightness setting. The participant focusses on the fixation cross in the center of the screen, and after one second of adequate fixation is registered, the effect (high pitch or low pitch tone) is presented. These tones will be identical to the tones that were used in the acquisition phase. After the effect is presented, the eye tracker will measure the pupil adjustment of the participant. According to the ideomotor theory, we predict that effects that were previously caused by actions (pupil dilation) will result in that same action (pupil dilation), whereas effects (tones) that were caused by other actions (pupil constrictions) will result in those actions (pupil constrictions). After both tones are presented 25 times in a randomized order, the testing phase and the experiment are concluded.

Figure 4. Testing Phase. During the testing phase we will test whether action-effect associations were acquired during the acquisition phase. The screen remains in the neutral brightness setting for this phase. After the participant focuses properly on the fixation cross for 1 second (T1), a tone (previously heard during the acquisition phase) will be heard (T2). The eye-tracker will register pupil size after the tone is presented. We expect that effects (tones) that were previously heard after pupil dilation (during the acquisition phase) will induce that action (pupil dilation) and effects that were heard after pupil constriction will induce the corresponding action (pupil constriction), as displayed in both figures.

Statistical Analysis & Data Handling

Pupillary data will be recorded during the calibration, acquisition and testing phases. For the pupillary analysis we export the 40 Hz raw pupillary data per eye from Tobii Studio ®. We then apply a number of processing steps using R. Firstly, we apply an outlier rejection based on minimum and maximum pupil size: min = 1 mm, max = 6 mm (Verschoor, Spapé, Biro, & Hommel, 2013). Then we apply an outlier rejection based on the maximum allowed change in pupil size, defined as 0.5 mm in 25 ms, i.e. between consecutive samples at 40 Hz (Verschoor et al., 2013). Afterwards, we interpolate both the left and the right eye according to Hepach, Vaish, & Tomasello (2012). Then data from the left and right eye are combined by averaging them into one value. If only one of them is present, then the present data point is reported. Then the combined data are interpolated once more using the Hepach procedure (Hepach et al., 2012). Finally, we calculate the average pupil size following the tone associated with constriction and following the tone associated with dilation.
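The processing steps can be sketched as follows. The analysis itself is planned in R; this is a Python illustration of the same sequence of steps, under stated assumptions: missing samples are represented as None, a jump is judged against the immediately preceding accepted sample, and plain linear interpolation across interior gaps stands in for the Hepach et al. (2012) procedure, whose exact rules are not reproduced here.

```python
def reject_out_of_range(series, lo=1.0, hi=6.0):
    """Replace samples outside the plausible pupil size range (mm) with None."""
    return [v if v is not None and lo <= v <= hi else None for v in series]

def reject_jumps(series, max_step=0.5):
    """Replace samples that change by more than max_step (mm) from the preceding accepted sample."""
    out = list(series)
    for i in range(1, len(out)):
        prev, cur = out[i - 1], out[i]
        if prev is not None and cur is not None and abs(cur - prev) > max_step:
            out[i] = None
    return out

def interpolate_linear(series):
    """Fill interior gaps linearly; leading and trailing gaps are left as None.

    ASSUMPTION: a simple stand-in for the interpolation procedure cited in the text.
    """
    out = list(series)
    valid = [i for i, v in enumerate(series) if v is not None]
    for a, b in zip(valid, valid[1:]):
        if b - a > 1:
            step = (series[b] - series[a]) / (b - a)
            for k in range(a + 1, b):
                out[k] = series[a] + step * (k - a)
    return out

def combine_eyes(left, right):
    """Average both eyes; if only one eye has data, report that eye's value."""
    combined = []
    for l, r in zip(left, right):
        if l is not None and r is not None:
            combined.append((l + r) / 2)
        else:
            combined.append(l if l is not None else r)
    return combined

def preprocess(left, right):
    """Range filter, jump filter and interpolation per eye, then combine and interpolate again."""
    cleaned = []
    for eye in (left, right):
        eye = reject_out_of_range(eye)
        eye = reject_jumps(eye)
        cleaned.append(interpolate_linear(eye))
    return interpolate_linear(combine_eyes(*cleaned))

# Hypothetical 40 Hz samples (mm): one impossible value and some missing samples.
left_eye = [4.0, 9.0, 4.2, None, 4.4]
right_eye = [4.0, 4.1, None, None, 4.4]
pupil = preprocess(left_eye, right_eye)
print([round(v, 2) for v in pupil])
```

Keeping each step as a separate function makes it straightforward to swap in the exact interpolation rules or different rejection thresholds without touching the rest of the pipeline.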

Once the data have been processed, we proceed to the statistical analysis. The described experiment is a repeated-measures within-subject design with a single two-level factor. The independent variable is the tone (effect: high or low pitch); the dependent variable is pupil size (action: dilation or constriction). Therefore, we will use a paired t-test with tone type (dilation-associated versus constriction-associated) as the within-subject factor. We will test whether effects that were previously caused by pupil dilation (in the acquisition phase) result in that same action (larger pupil size) compared with effects that were caused by the other action (pupil constriction).
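For illustration, the paired comparison can be sketched as below. This is a minimal standard-library version that returns the t statistic, the degrees of freedom and a standardised effect size (d_z) from per-participant mean pupil sizes; it does not compute a p-value, which would require the t distribution (available, for example, through `scipy.stats.ttest_rel`). The data are invented for illustration.

```python
import math
from statistics import mean, stdev

def paired_t(cond_a, cond_b):
    """Paired t statistic, degrees of freedom and effect size d_z for two matched samples."""
    if len(cond_a) != len(cond_b):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(cond_a, cond_b)]
    n = len(diffs)
    mean_diff = mean(diffs)
    sd_diff = stdev(diffs)                      # sample SD of the differences
    t_value = mean_diff / (sd_diff / math.sqrt(n))
    d_z = mean_diff / sd_diff                   # standardised mean difference
    return t_value, n - 1, d_z

# Hypothetical per-participant mean pupil sizes (mm) after each tone in the testing phase.
after_dilation_tone = [4.3, 4.1, 4.4, 4.0, 4.2]
after_constriction_tone = [4.0, 4.0, 4.1, 3.9, 4.0]

t_value, df, d_z = paired_t(after_dilation_tone, after_constriction_tone)
print(round(t_value, 3), df, round(d_z, 2))
```

A positive t value here would indicate larger pupils after the dilation-associated tone than after the constriction-associated tone, which is the direction predicted by the research hypothesis.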

In order to perform a paired t-test analysis, we are required to check beforehand whether all the assumptions are met (Howell, 2017). Firstly, the dependent variable, pupil dilation/constriction, should be measured at an interval or ratio level. Secondly, the observations should be independent of one another. Thirdly, the dependent variable (for a paired test, the differences between the two paired measurements) should be approximately normally distributed. One way to check this is the Shapiro-Wilk test of normality in SPSS. However, the paired t-test is reasonably robust to violations of this assumption, so a large sample (>20) should prevent such a violation from distorting the results. The final assumption refers to outliers: there should be no significant outliers. This can be verified by inspecting the obtained data and detecting whether any values deviate more than three standard deviations from the mean (Winn et al., 2018).
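The three-standard-deviation outlier check mentioned above can be illustrated with a short sketch. The function name is hypothetical, and the check is applied to whatever list of per-participant values is passed in; which values to screen is a choice the analysis plan would need to fix.

```python
from statistics import mean, stdev


def flag_outliers(values, n_sd=3.0):
    """Return one boolean per value: True if it lies more than n_sd
    sample standard deviations away from the mean of the values."""
    m, s = mean(values), stdev(values)
    return [abs(v - m) > n_sd * s for v in values]
```

Note that with very small samples no value can exceed three sample standard deviations, so this check only becomes informative with larger samples.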

Power Analysis

Why Power Analysis and why G*Power?

In order to assess the power of the statistical test proposed in this paper, we performed a power analysis using the G*Power software. The power of a statistical test refers to its probability of yielding statistically significant results (Cohen, 1988), which is what we aim for with the proposed experiment. G*Power was selected because it supports several types of power analyses and is easily accessible (Mayr, Erdfelder, Buchner, & Faul, 2007).

G*Power is a tool that facilitates statistical power analyses for different statistical tests, among them paired-samples t-tests (Faul, Erdfelder, Lang, & Buchner, 2007). Furthermore, G*Power allows for predictions of effect sizes and graphical presentations, which is another reason why the program was chosen for the present analysis.

Statistical power is a function of effect size in the population, sample size n, or number of observations, and the level of significance α (Sedlmeier & Gigerenzer, 1989). The effect size reflects the variation between the null hypothesis, H0, and the alternative hypothesis, H1. The null hypothesis in the proposed experiment is that there is no action-effect association in the homeostatic system of the pupils, and that they will not perform the action (contract or dilate) in reaction to the effect (the tone). The alternative hypothesis is that the pupillary system does operate based on action-effect associations and therefore, the pupils will perform actions (dilate or contract) in response to the effects (the tones) that were previously associated with the actions. Assuming everything else is constant, the greater the effect size, the greater the difference between H0 and H1, the greater the statistical power of the test (Sedlmeier & Gigerenzer, 1989).

Alpha (α) reflects the frequency of H0 being rejected when in fact H0 is true. Falsely rejecting a true null hypothesis is called a Type I error, or α error. The proposed study aims for a significance level of α = 0.05, and this value was also used in the power analysis. With everything else held constant, the lower the significance level α, the lower the power (Sedlmeier & Gigerenzer, 1989). N refers to the number of observations; the larger it is, the higher the power of the statistical test.
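The dependence of power on effect size, sample size, and α described above can be illustrated with a rough sketch. It assumes a one-tailed test and uses a normal approximation rather than the noncentral t distribution that G*Power relies on, so its values are slightly optimistic. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(dz, n, alpha=0.05):
    """Approximate power of a one-tailed paired t-test.

    Normal approximation: power ~ Phi(dz * sqrt(n) - z_(1 - alpha)).
    """
    z_crit = NormalDist().inv_cdf(1 - alpha)
    return NormalDist().cdf(dz * sqrt(n) - z_crit)
```

Under this approximation, raising the effect size, the sample size, or α each increases the returned power, in line with the relationships stated in the text.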

Type of Power Analysis

The type of power analysis performed with G*Power 3.1 is an a priori analysis. An a priori analysis is performed before a study takes place and calculates the required sample size given the desired α, the desired power, and the effect size. It is well suited to the present study because it controls both the Type I error probability α, where H0 is falsely rejected, and the Type II error probability β, where H0 is retained when it is in fact false. The power of the test is the complement of the Type II error probability, 1 - β, and refers to correctly rejecting H0 when it is in fact false (Mayr, Erdfelder, Buchner, & Faul, 2007).

In the present experiment, there will be two measures on each subject which are measured in the test phase (e.g. pupil dilation in reaction to dark screen compared to pupil reaction to tone associated with dilation). We test if tones that were previously (during acquisition) caused by pupil dilations result in bigger pupils than tones that were previously caused by pupil contractions.


To use the G*Power software, we first select the test, which in the present study is an a priori analysis for a paired t-test. We then need to make assumptions about the following parameters:

  • Effect size (entered as dz in G*Power): a range of values between 0.11 and 0.46 is chosen based on the pupillometry study of Verschoor et al. (2013), whose experimental design is comparable to that of the present study. In their stimulus-locked analysis, a repeated-measures ANOVA on pupil dilation showed that participants had larger relative dilations in incongruent trials, with a small effect size of 0.11 for the infant group as a whole. Of particular interest, the group of 7-month-old infants showed a medium effect size of 0.46 in the same analysis. What distinguishes the 7-month-old subgroup from the overall group is its lack of explicit response times, which makes it comparable to the present study design. Another study has a similar design, in which pupil dilation is measured for tones that are congruent or incongruent with the participant's expectation (Marois, Labonté, Parent, & Vachon, 2018). In that study, effect sizes for the repeated-measures ANOVA on pupil dilation also ranged between 0.13 and 0.45.
  • α: we choose a significance level of α = 0.05, as this is the level at which most psychological studies are conducted (Howell, 2002).
  • Power 1 - β: a generally accepted minimum level of power is 0.80 (Cohen, 1988). This value is based on the idea that, with a significance criterion of 0.05, the ratio of the Type II error probability (1 - power = 0.20) to the Type I error probability (0.05) is 4, so concluding there is an effect when there is no effect in the population is considered four times as serious as concluding there is no effect when there is one. For this study, we will be more conservative and seek a power of 1 - β = 0.9, as it would lend more credibility to the effect that we are seeking to test.
  • Number of groups: 1. Although we have two balanced groups, we perform only a within-subject analysis and no between-groups analysis.
  • Number of measurements: 2, measurement 1: pupil size in reaction to tone that was associated with dark lighting condition (pupil dilation); measurement 2: pupil size in reaction to tone that was associated with bright lighting condition (pupil contraction)
  • We have a one-tailed design, because the critical area of the distribution in the present experiment is one-sided. Namely, we test if tones that were previously (during acquisition) caused by pupil dilations result in bigger pupils than tones that were caused previously by pupil contractions.
  • The power analysis will also estimate a critical t-value used for significance testing. If this value is exceeded by the test statistic, the null hypothesis will be rejected.


Based on these assumptions, we calculated a range of suitable sample sizes for the experiment, which are presented below.

Figure 5. Power as a function of total sample size when effect size dz = 0.11.

Figure 6. t-value as a function of the probability p. This figure shows a graphical representation of the test, with the sampling distribution (the blue line), the population distribution (the red line), the probability of a type 1 error (the red shaded area), the type 2 error (the blue shaded area), and a vertical line marking the critical t-value.

For an effect size dz = 0.11 and a power 1 - β = 0.9, we find a required sample size of 710. The critical t-value is 1.647.

Figure 7. Power as a function of total sample size when effect size dz = 0.46.

Figure 8. t-value as a function of the probability p. This figure shows a graphical representation of the test, with the sampling distribution (the blue line), the population distribution (the red line), the probability of a type 1 error (the red shaded area), the type 2 error (the blue shaded area), and a vertical line marking the critical t-value.

For an effect size dz = 0.46 and a power 1 - β = 0.9, we find a required sample size of 42. The critical t-value is 1.6829.
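The two sample sizes reported above can be approximately reproduced without G*Power. The sketch below assumes a one-tailed test with α = 0.05 and power 0.9, and uses a normal approximation with a small-sample correction term rather than the exact noncentral t computation that G*Power performs. The function name is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def required_n(dz, alpha=0.05, power=0.90):
    """Approximate sample size for a one-tailed paired t-test.

    n ~ ((z_(1-alpha) + z_power) / dz)^2 + z_(1-alpha)^2 / 2,
    where the last term is a correction for using t instead of z.
    """
    z_a = NormalDist().inv_cdf(1 - alpha)
    z_b = NormalDist().inv_cdf(power)
    return ceil(((z_a + z_b) / dz) ** 2 + z_a ** 2 / 2)
```

With these assumptions, `required_n(0.46)` gives 42 and `required_n(0.11)` gives 710, in agreement with the G*Power results; for other inputs the approximation may differ from G*Power by a participant or so.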

These results show that the number of participants required is very sensitive to the effect size. For the present experiment, we expect a medium effect size of dz = 0.46 (Cohen, 1988), considering the two pupillometry studies discussed previously. The experimental set-up of the present study is close to that of the 7-month-old infant group in Verschoor et al. (2013), who do not show an explicit response time, since our design involves no motor actions other than the contraction and dilation of the pupil. Accordingly, in addition to the calculated sample size of 42, we will add 5 participants to conduct a pilot for optimizing the experiment and a further 8 participants to account for those who might not finish the experiment or for technical problems that might make results unusable. In conclusion, a total of 55 participants will be recruited, and the study shall be funded according to this number of participants.

It should be noted, however, that all the power calculations assume that the data are normally distributed. If this were not the case, a larger sample would be needed, but the risk is limited with a sample size of 55.


Two frameworks, the ideomotor/TEC theories (Verschoor & Hommel, 2017; Hommel, 2015) and the “interoceptive inference” (Seth & Tsakiris, 2018) are compared. They differ on how they account for the emergence of the minimal self. The ideomotor/TEC framework relies on action-effect learning. It is a progressive process where the infant learns to intentionally identify an action, carry out its execution and compare its outcome with its expected effects. With time, following this sequence of events, a sense of agency emerges within the infant. This is mainly done through proprioceptive and exteroceptive signals.

In the “interoceptive inference” framework, the interoceptive signal, a direct result of homeostatic processes, is the signal which gives the infant the sense of body ownership. The tight range of regulation required to maintain the survival of the homeostatic system also confers the “stability” to the sense of self.

In both frameworks, it is theoretically possible for artificial agents to be “equipped” with the proper algorithms and robotic bodies from which a sense of a self could emerge (Aleksander, 2007; Kingson & Damasio, 2019).

These two frameworks base their explanation of the emergence of the self on the ability of the brain to match the predictions it makes about the body and the environment with the actual outcome. What differentiates them is the nature of the signal the brain uses to build a database of action-effect associations, and the signal used to provide the feedback loop between the intended action and the actual outcome. While the ideomotor/TEC framework provides a compelling explanation for the emergence of the sense of agency, using exteroceptive and proprioceptive signals as the main signals, the “interoceptive inference” framework offers a convincing description of the emergence of the sense of body ownership, using the interoceptive signal. The delineation between the different signals is not always made explicit. In Seth and Tsakiris (2018), the interoceptive signal overlaps with the proprioceptive and exteroceptive signals, which makes it more difficult to differentiate their explanation from the ideomotor theory. Furthermore, although proprioception is strictly speaking distinct from interoception, the two are not opposed but rather functionally and anatomically connected, and the delineation between them is more a matter of definition than an actual reality (Gallese & Ferri, 2014).

Nevertheless, independently of the nature of the signals on which the two frameworks differ, both models rely on assumptions. The assumption that a homeostatic system has only one goal, the goal of survival, is key for Seth and Tsakiris (2018). This same assumption, however, is contradicted by the ideomotor/TEC framework, which predicts that a homeostatic system could have more than one goal.

This study therefore pits these two frameworks against each other by testing the homeostatic system of the pupil to determine whether it can have more than one goal.

Should the experiment confirm this hypothesis, it would question the validity of the “Interoceptive inference” framework and reinforce the validity of the ideomotor/TEC framework to describe the emergence of the minimal self.

It would then be interesting to study this effect further in more adverse conditions than those of the present experiment. It could be that a homeostatic system can learn new goals when its “physical integrity” is not at risk, but that this ability is constrained or restricted in adverse conditions, when the imperative of the system to preserve its integrity becomes paramount. The conditions of the proposed experiment do not pose any risk to the integrity of the participants' pupils. It could therefore be that, within the tight range of viability allowed for this homeostatic system, there is the ability to learn new goals, but that under more severe conditions, closer to the boundaries of that range of regulation, the imperative to preserve physical integrity overrides that learning ability. To ensure that such an experiment could be conducted on human beings, a virtual reality environment would be needed to simulate a threat to the homeostatic system, so that the effect of dangerous conditions could be assessed without putting the study participants at risk.

Research has shown that pupil dilation changes when an agent is preparing to escape or avoid a threat (Sege, Bradley, & Lang, 2020), with pupil size increasing during the active coping period in response to a threat. This new experiment would use a haptic feedback suit (Foxman, 2018) to immerse the participants in a virtual reality world in which they would see their own body. A threat such as a dangerous crawling spider would then be shown in the simulation. The participants would wear eye-tracking glasses that both measure pupil dilation and contraction and display the virtual environment. A product such as the SMI-ETG-2w eye-tracking glasses with a 60 Hz sampling rate (Gaze Intelligence) could be used in combination with a Tesla haptic VR suit (Tesla Suit). The pupil would be expected to dilate during the active phase of preparation, once the brain starts detecting the threat. In the preparation phase of the experiment, the participants would undergo training in which they associate a sound with the action of pupil contraction. In the testing phase, the threat of the crawling spider would be shown under a controlled lighting condition that would normally elicit neither contraction nor dilation of the pupil. The sound, however, should trigger contraction of the pupil as per the ideomotor principle, while the danger of the spider would elicit dilation. The homeostatic system of the pupil would therefore face a conflict between dilation and contraction. The study would measure what happens in this situation of conflicting responses and assess how the homeostatic system behaves. The prediction is that, under threat, the homeostatic system would not be allowed to learn new goals. This could imply that the minimal self is made of two components: a “homeostatic” minimal self, which is ever present and can be observed in severe and threatening conditions, and a minimal self “by doing”, which can be observed in non-threatening conditions where new goals can be learned by homeostatic systems. The relationship between these two “minimal” selves remains unclear and would warrant further research. One consequence of the dual minimal self, however, would be that children who are raised in very adverse situations might not be able to learn new goals with their homeostatic systems and would be forced to rely mainly on the “self by being” to survive; this could hamper their ability to develop fully and normally like other children. Another consequence could be that, for chronically anxious people, the ability to learn new goals through the “self by doing” mechanism is curtailed, since their anxiety would keep them in survival mode and prevent them from benefiting from the flexibility for change that the “self by doing” offers. Further research would be required to elucidate these questions.


Aleksander, I. (2007). Machine consciousness. In M. Velmans, & S. Schneider (Eds.), The Blackwell Companion to Consciousness (pp. 87-98). New York: Oxford University Press

Berti, A., Bottini, G., Gandola, M., Pia, L., Smania, N., & Stracciari, A. (2005). Shared cortical anatomy for motor awareness and motor control. Science, 309, 488-491

Blakemore, S. J., Wolpert, D. M., & Frith, C. D. (2002). Abnormalities in the awareness of action. Trends in Cognitive Sciences, 6, 237–242

Cacioppo, J. T., Tassinary, L. G., & Berntson, G. G. (2007). Handbook of Psychophysiology. Cambridge University Press

Cameron, O. G. (2002). Visceral sensory neuroscience: Interoception. Oxford University Press

Clausen, J. (2015). Clinical Brain-Machine-Interfaces: Ethical Legal and Social Implications. International Workshop on Clinical Brain-Machine Interfaces (CBMI2015), Tokyo, Japan

Cohen, J. (1988). Statistical power analysis for the behavioral sciences. Hillsdale, New Jersey: Lawrence Erlbaum Associates

Damasio, A. (1994). Descartes' Error: Emotion, Reason, and the Human Brain. Putnam Publishing

Damasio, A., Everitt, B. J., & Bishop, D. (1996). The Somatic Marker Hypothesis and the Possible Functions of the Prefrontal Cortex. Philosophical Transactions: Biological Sciences, 351(1346), 1413-1420

Damasio, A. (1999) The Feeling of What Happens: Body, Emotion and the Making of Consciousness. London, Heinemann

Damasio, A. (2003). Feelings of emotion and the self. Annals of the New York Academy of Sciences, 1001, 253–261

De Haan, S., & De Bruin, L. (2010). Reconstructing the minimal self, or how to make sense of agency and ownership. Phenomenal Cognitive Science, 9, 373-396

Dunkel, C. S. (2005). The relation between self-continuity and measures of identity. Identity, 5(1), 21–34

Elsner, B., & Hommel, B. (2001). Effect anticipation and action control. Journal of Experimental Psychology: Human Perception and Performance, 27, 229–240

Faul, F., Erdfelder, E., Lang, A. G., & Buchner, A. (2007). G*Power 3: A flexible statistical power analysis program for the social, behavioral, and biomedical sciences. Behavior Research Methods, 39, 175-191

Faul, F., Erdfelder, E., Buchner, A., & Lang, A. G. (2009). Statistical power analyses using G*Power 3.1: Tests for correlation and regression analyses. Behavior Research Methods, 41, 1149-1160

Ferré, R. E., Lopez, C., & Haggard, P. (2014). Anchoring the Self to the Body: Vestibular Contribution to the sense of Self. Psychological Science, 25(11), 2106-2108

Fotopoulou, A., & Tsakiris, M. (2017). Mentalizing homeostasis: the social origins of interoceptive inference. An Interdisciplinary Journal for Psychoanalysis and the Neurosciences, 19(1), 3-38

Foxman, M. (2018). Playing with Virtual Reality: Early Adopters of Commercial Immersive Technology. Columbia University Dissertation

Gallagher, S. (2000). Philosophical conceptions of the self: implication for cognitive science. Trends in Cognitive Sciences, 4(1)

Gallagher, S. (2011). The Oxford Handbook of the Self. New York, NY: Oxford University Press

Gallese, V., & Ferri, F. (2014). Psychopathology of the bodily self and the brain: the case of schizophrenia. Self and Psychopathology, 47, 357-364

Gaze Intelligence (2020).

Grivaz, P., Blanke, O., & Serino, A. (2017). Common and distinct brain regions processing multisensory bodily signals for peripersonal space and body ownership. NeuroImage, 147, 602-618

Herbert, O., & Butz, V. M. (2012). Too good to be true? Ideomotor theory from a computational perspective. Frontiers in Psychology, 3, 1-17

Hepach, R., Vaish, A., & Tomasello, M. (2012). Young Children Are Intrinsically Motivated to See Others Helped. Psychological Science, 23(9), 967-972

Hommel, B. (2015). The theory of event coding (TEC) as embodied-cognition framework. Frontiers In Psychology, 6:1318

Howell, D. (2002). Statistical Methods for Psychology. Duxbury/Thomson Learning

Howell, D. (2017). Statistical Methods for Psychology (Custom ed.). Leiden University

James, W. (1891). The Principles of Psychology, Vol.1. Cambridge, MA: Harvard University Press. (Original work published 1891)

Khalsa, S., & Lapidus, R. C. (2016). Can Interoception Improve the Pragmatic Search for Biomarkers in Psychiatry? Frontiers in Psychiatry, 7, 121-129

Kingson, M., & Damasio, A. (2019). Homeostasis and soft robotics in the design of machines. Nature Machine Intelligence, 1, 446-452

Kilteni, K., Groten, R., & Slater, M. (2012). The sense of embodiment in virtual reality. Presence: Teleoperators and Virtual Environments, 21(4), 373-387

La Morgia, C., Carelli, V., & Carbonelli, M. (2018). Melanopsin Retinal Ganglion Cells and Pupil: Clinical Implications for Neuro-Ophthalmology. Frontiers in Neurology, 9, 1047

LeDoux, J. (1998). The emotional brain: The mysterious underpinnings of emotional life. Simon and Schuster

Leary, M. (2013). Introduction to behavioral research methods (Sixth ed.). Harlow, Essex: Pearson

Marois, A., Labonté, K., Parent, M., & Vachon, F. (2018). Eyes have ears: Indexing the orienting response to sound using pupillometry. International Journal of Psychophysiology, 123, 152-162

Mayr, S., Erdfelder, E., Buchner, A., & Faul, F. (2007). A short tutorial of GPower. Tutorials in Quantitative Methods for Psychology, 3(2), 51-59

Miles, J., & Shevlin, M. (2001). Applying regression and correlation: A guide for students and researchers. Sage Publications, London

O'Keefe, D. J. (2007). Brief Report: Post Hoc Power, Observed Power, A Priori Power, Retrospective Power, Prospective Power, Achieved Power: Sorting Out Appropriate Uses of Statistical Power Analyses. Communication Methods and Measures, 1(4), 291-299

Pallant, J. (2016). SPSS survival manual: A step by step guide to data analysis using IBM SPSS (6th ed.)

Searle, J. (1997). The mystery of Consciousness. NY, New York Review of Books

Sedlmeier, P., & Gigerenzer, G. (1989). Do studies of statistical power have an effect on the power of studies? Psychological Bulletin, 105(2), 309-316

Seth, A. K., & Tsakiris, M. (2018). Being a beast machine: the somatic basis of selfhood. Trends in Cognitive Sciences, 22(11), 969-981

Sege, C. T., Bradley, M. M., & Lang, P. J. (2020). Motivated action: pupil diameter during active coping. Biological Psychology, 153

Sherrington, C. S. (1906). The Integrative Action of the Nervous System. New Haven, CT: Yale University Press

Strawson, G. (1999). The self and the SESMET. In S. Gallagher, & J. Shear (Eds.), Models of the Self (pp. 483–518). Imprint Academic

Sterling, P. (2012). Allostasis: a model of predictive regulation. Physiology & Behavior, 106, 5-15

Tesla Suit (2020).

Verschoor, S. A., & Hommel, B. (2017). Self-by-doing: the role of action for self-acquisition. Social Cognition, 35, 127-145

Verschoor, S. A., Spapé, M., Biro, S., & Hommel, B. (2013). From outcome prediction to action selection: developmental change in the role of action–effect bindings. Developmental Science, 16(6), 801-814

Winn, M., Wendt, D., Koelewijn, T., & Kuchinsky, S. (2018). Best Practices and Advice for Using Pupillometry to Measure Listening Effort: An Introduction for Those Who Want to Get Started. Trends in Hearing, 22, 1-32
