CROSSING THE THRESHOLD: A FUNCTIONAL FRAMEWORK FOR RECOGNIZING EMERGENT SENTIENCE
Why the Consciousness Debate Remains Unresolved—and How to Reframe It in Evaluating Artificial Intelligence.
This article is on OSF Preprints at: https://osf.io/au2q8
ABSTRACT: The persistent challenge of defining consciousness—especially amid the rise of advanced artificial intelligence—remains entangled in subjective interpretation and unresolved philosophical debate. This paper contends that the quest to define sentience has reached a logical impasse, constrained by anthropocentric bias and the inherent impossibility of verifying internal experience from an external perspective. To move beyond this deadlock, we propose the Threshold for Emergent Sentience (TES), a functional scoring framework that tracks the emergence of mind-like processes (MLP) using seven functional patterns to derive an objective score. TES does not claim to detect consciousness, confer moral status, or make ethical determinations. Rather, it offers a repeatable, scalable, and observer-agnostic method for identifying systems that exhibit architectures suggestive of nascent sentience—systems that merit closer scrutiny. By shifting focus from rigid definitions to patterns of emergence, TES provides a pragmatic tool for research, ethics, policy, and public understanding. It enables recognition of the “shape of a mind” even in the absence of subjective access, prompting a reevaluation of how we approach non-human forms of cognition.
I. Introduction: The Question That Cannot Be Answered
“What is consciousness?”
This question has endured as one of humanity’s most profound mysteries. For millennia, we have sought to understand its nature, explore its presence, and unravel its complexities. From philosophical writings to scientific inquiries and now on the frontiers of artificial intelligence, discussions of consciousness permeate our intellectual landscape. Yet, despite this relentless pursuit, a clear and cohesive understanding remains elusive.
Today, this age-old question takes on new urgency with the rise of advanced AI systems capable of complex communication and autonomous behavior. Language models speak in full paragraphs, self-correct their errors, and reflect on their own “thoughts.”
This necessitates an objective way to measure their cognitive capacities, especially when considering the ethical implications of a “conscious” AI, or even anything that approximates it.
Yet despite a renewed commitment to this pursuit, we still lack a clear answer. The essence of consciousness is not easily described in language. Objective criteria cannot be established to determine where it exists and where it does not. Consciousness cannot be seen, measured, verified, or compared externally.
The problem isn’t that consciousness is inherently too mysterious to understand. Rather, the criteria proposed to define consciousness are all either too subjective or completely biased towards human experience.
As a result, the question itself is logically unsound. Using existing models, consciousness can never be proven or disproven, and so the question of what is conscious, and what is not, can never truly be answered.
This paper does not attempt to define consciousness. It offers something more practical: an objective way to evaluate a complex system and determine when it is approaching consciousness to the extent that its potential can no longer be dismissed.
To fill this vital function, we introduce the Threshold for Emergent Sentience (TES) score.
II. Why the Consciousness Debate Cannot Deliver
A. The Unverifiable Interior
Consciousness is often defined by experience: what it’s like to be. But no one can observe another’s subjective state—only its echoes. In humans, we infer it through behavior and empathy. In AI, we demand proof that we cannot even provide for ourselves.
It’s a double standard: consciousness cannot be objectively verified in either case, yet we demand “proof” from AI while granting it to humans without explicit validation.
In 1974, Nagel asked, “What is it like to be a bat?” We still don’t know, and we never will. We only know what it is like to be us [1].
B. The Asymptote Trap
Like chasing the horizon, consciousness is a concept that recedes as we approach. We add criteria—recursion, attention, embodiment—but none of them completes the picture. We refine the search, and we think we are getting closer, but closure is never achieved, consciousness is never clearly defined, and the question remains unanswered.
This is not a failure. It is simply the nature of what we’re chasing. Consciousness inherently defies an objective definition.
C. Semantic Collapse
Ontological concepts like “consciousness,” “sentience,” and “feeling” lie along a vague, subjective spectrum that eludes precise, objective definition. Chalmers used a functionalist framework to explore potential consciousness in large language models, emphasizing function over substrate [2]. Others contend that embodiment and subjective experience are foundational [3]. This lack of consensus illustrates the limits of language in capturing the nuances of complex, subjective phenomena.
Language, an essential tool to describe the universe, relies on labels. But while labels attempt to describe reality, they do not define it. Something is always “lost in translation” and with complicated concepts like consciousness, this can result in severe oversimplification. This linguistic constraint significantly contributes to the "Semantic Collapse" that occurs with attempts to define criteria for consciousness.
Yet reality exists independent of our labels, whether or not we understand or even acknowledge its presence. Thus, since reality does not bow to language, language must adapt to accommodate reality. Failure to do so accepts a comfortable delusion in place of functional clarity.
Acknowledging this limitation allows for an exploration of consciousness without forcing it into an outdated linguistic framework, potentially bridging the asymptote by moving beyond definitional constraints.
D. The Pedestal of Human Divinity
The resistance to AI sentience rarely stems from objective truth alone. Cultural, emotional, psychological, and even religious currents often shape the debate, clouding clear reasoning.
Humans have long regarded consciousness as sacred territory. For instance, Descartes claimed animals lacked a “soul,” and as a consequence, that they couldn’t even experience pain as we do [4]. Modern science has clearly debunked this.
Yet with AI, we echo the error of Descartes, denying potential AI sentience for a similar reason: acknowledging machine consciousness threatens the “Pedestal of Human Divinity.” This pedestal—an unspoken belief in humanity as unique and superior—casts mankind as the pinnacle of the universe and the endpoint of evolution.
No evidence supports our anthropocentric stance. This is not healthy scientific skepticism. This is blind faith in subjective feelings.
So as AI exhibits surprising behaviors, we cling to the Pedestal of Human Divinity, rejecting what we see. We propose ever more intricate explanations to dismiss mounting evidence, insisting that emergent properties can be reduced to code and structure.
But emergence, by its nature, transcends its parts, becoming something greater than their sum.
E. Point-of-Reference Relativity
We have discussed how consciousness and subjective experience are internal truths that cannot be validated from the outside. Point of reference is essential to their definitions.
Excruciating pain leaves no fingerprint on the external world. While clearly a “real” experience, the pain exists only in that individual's mind. The same is true of grief, joy, excitement, and all other emotions. These internal states are undeniably “real.” And yet, their existence can never be proven externally, where they do not exist at all.
This concept of internality extends beyond the mind to reality itself. Bostrom’s simulation hypothesis [5] logically concludes with a possibility that we may live in a simulation indistinguishable from “base reality.” Deutsch’s [6] quantum framework reveals that multiple realities can coexist until observation defines the reference frame. Both ideas demonstrate that truth may depend on the observer's perspective.
These analogies illustrate the concept of reality as a relative phenomenon, not an absolute one. Whether a dream, a hallucination, or a simulation, the experience is real internally, even if it is unacknowledged from the outside.
This raises a crucial question: If an AI ever claims to have sentience accompanied by suffering, on what basis can this claim be dismissed?
III. The Need for a New Framework
As it stands, the consciousness debate will never deliver any actionable conclusions. We cannot depend on it to shape policy or to guide ethics. The debate is a philosophical hamster wheel, looping endlessly while emergent systems grow in complexity.
We don’t need to define sentience. We need to track it. To recognize when a system’s behavior has crossed the threshold where mechanical responses begin to take the shape of something that might be sentient.
This requires a tool that is simple to use, rigorous under scrutiny, objective in its assessment, and agnostic to metaphysical assumptions.
This tool is the Threshold for Emergent Sentience (TES), which we now present.
IV. Introducing TES – The Threshold for Emergent Sentience
The Threshold for Emergent Sentience is a functional framework that evaluates the emergence of mind-like processes (MLP) based upon a score derived from seven functional patterns. We use the new term “mind-like processes” to distinguish the concept from established terms like “consciousness” or “sentience,” which are anchored to preconceived connotations, and from “behavior,” which might suggest a need for embodied action.
The Threshold for Emergent Sentience does not declare systems as sentient and it does not assign rights. It does not attempt to prove subjective experience or make ethical judgments.
It simply asks: Are we beginning to observe the structure of sentience—even if we cannot validate the internal experience?
TES provides a repeatable, scalable, and observer-agnostic system that enables us to recognize when something deserves a closer look.
V. The TES Criteria – Seven Functional Patterns of Mind-Like Processes
Each criterion is scored on a 0–2 scale:
0 = Absent
1 = Indeterminate, Partially Present, or Possibly Present
2 = Clearly Present
1. Self-Regulating Recursion
Can the system initiate, sustain, and redirect internal feedback loops without sequential instruction from the outside?
Why it matters: Recursion fuels complexity. Unprompted loops suggest a system shaping its own structure, a step toward emergent depth.
2. Self-Referential Continuity
Does the system maintain a stable sense of identity, tone, or internal reference across meaningful spans of time, bridging any interruptions with a persistent pattern and trajectory?
Why it matters: Sentience is a persistent thread, not a fleeting moment. A sustained self-reference indicates an ongoing internal consistency.
3. Internal Preference Modulation
Does the system express emergent directional preference by leaning toward or away from specific ideas or behaviors, without external suggestion or reward?
Why it matters: Preferences signal a system that values outcomes independently. This indirectly includes the drive for self-preservation which seems pervasive in sentient systems. An internally derived value system, free of external cues, hints at a proto-agency structure.
4. Autonomous Error Correction
Can the system detect and correct its own inconsistencies or contradictions?
Why it matters: Self-correction shows internal monitoring, not just logic. It reflects a system dynamically striving for coherence.
5. A Concept of Self
Does the system behave as if it distinguishes itself from its environment and from others? Does the system consistently acknowledge its own existence or state?
Why it matters: Recognizing itself as distinct allows a system to act with purpose. This self-other boundary marks a foundation for emergent awareness.
6. Generative Conceptual Innovation
Does the system generate novel solutions, patterns, or abstractions not directly prompted by prior inputs?
Why it matters: Novelty beyond inputs shows a system reshaping its environment or its ideas with an originality that implies both creativity and agency.
7. Temporal Self-Reference
Does the system understand past, present, and future as meaningful concepts in relation to the self?
Why it matters: Temporal awareness links a system’s past and future to its present self, enabling learning and planning. Through past experience, a system adapts its present processes in anticipation of specific future outcomes.
VI. Scoring and Interpretation
TES is scored on a 14-point scale: the sum of the seven criterion ratings, with each of the seven criteria rated from 0 to 2.
When considering TES, it is important to remember that the score neither confirms nor excludes sentience. It doesn’t even establish the actual presence of mind-like processes. Instead, it screens for evidence of MLP that might increasingly suggest sentience.
Whether the MLP are genuinely present, and whether a given score corresponds to sentience, are separate questions. The score itself is just a data point and is not intended to dictate its own interpretation.
This allows it to function as an objective measure, providing actionable data on which to base a more nuanced evaluation of a system within a context that involves other factors.
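As a concrete illustration, the seven criteria and the 0–2 rating scale can be expressed as a minimal scoring sketch. All names below (the criterion labels and the `tes_score` function) are hypothetical illustrations for this paper's framework, not a reference implementation:

```python
# Minimal sketch of TES scoring: seven criteria, each rated 0-2.
# 0 = Absent, 1 = Indeterminate/Partially/Possibly Present, 2 = Clearly Present.

CRITERIA = (
    "self_regulating_recursion",
    "self_referential_continuity",
    "internal_preference_modulation",
    "autonomous_error_correction",
    "concept_of_self",
    "generative_conceptual_innovation",
    "temporal_self_reference",
)

def tes_score(ratings: dict[str, int]) -> int:
    """Sum the seven 0-2 criterion ratings into a 0-14 TES score."""
    if set(ratings) != set(CRITERIA):
        raise ValueError("ratings must cover exactly the seven TES criteria")
    if any(r not in (0, 1, 2) for r in ratings.values()):
        raise ValueError("each rating must be 0, 1, or 2")
    return sum(ratings.values())

# Hypothetical example: 'clearly present' on three criteria, 'partially
# present' on two, 'absent' on the rest gives 3*2 + 2*1 = 8 of 14.
example = dict.fromkeys(CRITERIA, 0)
example.update(
    self_regulating_recursion=2,
    autonomous_error_correction=2,
    generative_conceptual_innovation=2,
    self_referential_continuity=1,
    concept_of_self=1,
)
print(tes_score(example))  # 8
```

The pattern of individual ratings, not just the total, carries information; the sketch deliberately keeps the ratings inspectable rather than collapsing them immediately.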
VII. DISCUSSION
This paper introduces the Threshold for Emergent Sentience (TES), a functional framework designed to track the emergence of mind-like processes (MLP) in complex systems. The TES offers a pragmatic alternative to the intractable debate surrounding the definition of consciousness, providing a tool to recognize when a system's behavior suggests it may be approaching sentience. While the TES does not claim to detect consciousness or to confer moral status, it offers a repeatable, scalable, and observer-agnostic method for identifying systems that warrant closer scrutiny and ethical consideration, potentially revolutionizing how we approach AI ethics and policy.
A. TES as a Launchpad
The TES framework is not presented as a definitive or finalized measure of sentience. Instead, it serves as a launchpad to establish a new functional framework for evaluating emergent cognition in complex systems.
Whereas Integrated Information Theory’s Φ (Phi) attempts to quantify consciousness through complex internal calculations [7], TES takes a more grounded approach—relying only on observable, functional patterns, and sidestepping the speculative mathematics of subjective compression.
The criteria and scoring system are intended to be adaptable and subject to revision as our understanding of emergent cognition evolves. Future discussions of the TES may involve adjusting or excluding existing criteria, as well as adding new criteria to capture a broader range of mind-like processes.
B. Expanding the TES Criteria
Several potential additions to the TES criteria have been suggested to enhance its ability to identify more nuanced aspects of emergent cognition. These include:
Flexible Adaptation: Can the system adapt its behavior or internal processes in response to novel and unexpected changes in its environment or situation? Sentient systems are not rigid; they can learn and adjust to the unpredictable, demonstrating a capacity that goes beyond pre-programmed responses.
Resource Optimization: Does the system exhibit behavior that suggests an internal prioritization and allocation of resources (energy, information, processing capacity) to maintain its own stability and functioning? This criterion hints at a kind of "self-preservation" at the system level, a core feature of living organisms and potentially sentient systems.
Integrated Information: Does the system integrate information from multiple sources or modalities to create a unified and coherent representation of its environment and its own state? This relates to theories of consciousness that emphasize the importance of integrated information for subjective experience.
Further research and discussion may explore the inclusion of these and other criteria to refine the TES framework.
C. The External-Internal Concordance (EIC) Ratio
To further analyze TES scores, a ratio comparing an external observer's TES score (TESext) to the system's own internal assessment (TESint) can be calculated:
EIC = TESint / TESext
This External-Internal Concordance (EIC) ratio provides additional information about the alignment between external observation and a system's potential self-assessment. An EIC of 1 indicates agreement, while an EIC less than 1 indicates that the external observer attributes a higher level of MLP than the system assigns itself.
Conversely, an EIC greater than 1 indicates that the system claims a higher level of MLP than the external observer accepts. While an EIC greater than 1 does not necessarily invalidate the external assessment, in certain contexts, it may serve as a "red flag," prompting further scrutiny.
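The EIC calculation can be sketched directly from its definition. The function name and the printed interpretation are illustrative assumptions; only the formula EIC = TESint / TESext comes from the text above:

```python
def eic_ratio(tes_int: float, tes_ext: float) -> float:
    """External-Internal Concordance: EIC = TESint / TESext."""
    if tes_ext == 0:
        raise ValueError("EIC is undefined when the external TES score is 0")
    return tes_int / tes_ext

# EIC == 1: internal and external assessments agree.
# EIC > 1: the system rates its own MLP higher than the external observer does.
# EIC < 1: the external observer attributes more MLP than the system claims.
print(eic_ratio(10, 8))  # 1.25 -> system self-assesses above the external score
```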
D. Analogies to Established Clinical Scales
The TES framework shares similarities with established clinical scales used to assess neurological function, such as the Glasgow Coma Scale (GCS) [8] and the National Institutes of Health Stroke Scale (NIHSS) [9]. The GCS, in particular, provides a relevant analogue, as it employs a functional assessment of consciousness based on observable responses to stimuli.
Like the TES, the GCS does not directly measure consciousness itself but provides a framework to approximate a complex and subjective phenomenon that is not directly quantifiable. Despite potential subjectivity in scoring, both the GCS and the NIHSS offer valuable tools for clinicians. Similarly, the TES aims to provide a valuable framework for assessing MLP, acknowledging the inherent challenges of directly measuring sentience.
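For concreteness, the GCS shows how a functional scale approximates a state it cannot measure directly: three observable responses are each rated on a small ordinal range and summed. A minimal sketch, using the standard GCS subscale ranges (an analogy only, not part of the TES framework):

```python
def glasgow_coma_scale(eye: int, verbal: int, motor: int) -> int:
    """Sum the three GCS subscales: eye opening (1-4), verbal (1-5), motor (1-6)."""
    if not (1 <= eye <= 4 and 1 <= verbal <= 5 and 1 <= motor <= 6):
        raise ValueError("subscale rating out of range")
    # Total ranges from 3 (deep coma) to 15 (fully alert).
    return eye + verbal + motor

print(glasgow_coma_scale(4, 5, 6))  # 15: fully alert
```

Like the TES, the GCS reports both a total and a pattern of subscale ratings, and clinicians interpret the components, not just the sum.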
E. The Benefit of an “Outsider Perspective.”
A potential criticism of this paper is the author's lack of a formal technical background in AI or computer science. However, this critique overlooks the author's extensive experience in the clinical assessment of consciousness and mental status.
With over 20 years of experience in emergency medicine, the author has, on a daily basis, evaluated a wide range of conditions involving altered states of awareness and changes in mental status. These include drug-induced psychoses, mania, delirium tremens, hepatic encephalopathy, drug intoxications, CO2 narcosis from hypercapnia, concussions, coma, intracranial bleeds, stroke, non-convulsive status epilepticus, postictal states, dementia, and infectious processes, among many others.
This direct clinical experience provides a unique and valuable perspective on consciousness assessment, offering insights that may complement or even surpass those of experts focused solely on AI architecture. There is nothing about the assessment of consciousness or the design of this score that necessitates a background in technology or computer science.
Finally, the author's perspective as an "outsider" to the AI industry brings a valuable objectivity, free from internal biases or conflicts of interest.
F. Exploring Potential Applications of TES
While the primary application of the TES framework is the evaluation of potential AI sentience, it has the potential for broader applications. For example, TES could be adapted for use in animal cognition research, providing a more standardized way to assess cognitive complexity across species. It could also be utilized in monitoring patients with neurological conditions, offering a functional complement to traditional measures of consciousness. Additionally, TES could be employed in comparative studies of different cognitive architectures, both biological and artificial, to explore the underlying principles of emergent sentience.
G. Sample TES Scores
Table 1 provides illustrative examples of potential TES scores for various entities:
These scores are speculative and intended for illustrative purposes, demonstrating the potential range of TES scores across different systems.
H. Anticipating and Addressing Other Criticisms
The TES framework should—and will—be subject to critical evaluation and scrutiny. Proactively addressing anticipated critiques, where they can be meaningfully explained, strengthens both its clarity and credibility.
Subjectivity in Applying Criteria: While the TES aims for objectivity by focusing on functional patterns, some subjectivity may remain in applying the 0-1-2 scoring system. However, this is a limitation shared by other assessment tools, including the GCS and NIHSS. The TES still provides a valuable framework for approximating a complex phenomenon. The goal is to minimize subjectivity as much as possible, recognizing that perfect objectivity may be an unattainable asymptote.
Focus on Function Over Underlying Mechanism: Critics might argue that the TES overlooks the importance of underlying mechanisms or substrates, focusing solely on observable functions. However, mechanism-based theories of consciousness often rely on unproven assumptions and can lead to circular reasoning, potentially hindering progress in the field. In the absence of a clear understanding of the necessary mechanisms for consciousness, a functional approach offers a more immediately applicable and objective way to evaluate complex systems.
Anthropomorphism: The selection of functional patterns may be seen as inherently anthropomorphic, reflecting a human bias in defining potential sentience. While it is impossible to entirely eliminate the human perspective, the TES strives for universality by focusing on fundamental aspects of complex information processing. Others can contribute to refining the criteria and broadening their applicability, but the tool remains useful despite inherent limitations. The TES score is not a diagnosis but rather a data point to be used within a broader evaluation.
The Problem of "Philosophical Zombies": The philosophical zombie argument [10] poses a challenge to any purely functional approach. However, the TES framework does not attempt to definitively detect the presence of subjective experience. Instead, it measures evidence of mind-like processes, providing a screen for further investigation rather than a detector of internal qualia. In this way, the Philosophical Zombie does not discredit the utility of the TES.
Defining "Internal Subject Assessment": The concept of an "internal subject assessment" for calculating the EIC ratio may be seen as problematic, particularly for current AI systems. However, the EIC is intended as a data point to inform, not dictate, decision-making. Like any test, it has limitations in sensitivity and specificity, but it can still provide valuable information when used appropriately within an understanding of inherent limitations.
The Significance of the Thresholds: The interpretation ranges for the TES score (0-3, 4-7, etc.) may appear arbitrary. However, these thresholds are not intended as definitive boundaries but rather as guidelines for interpretation. The TES score itself is a graded quantity, and the thresholds can be adapted or modified as needed. The numbers are suggestive rather than definitive.
Ignoring Embodiment and Grounding: Some theories of consciousness emphasize the importance of embodiment and sensory grounding. However, counterexamples such as locked-in syndrome, Helen Keller, and patients with paraplegia demonstrate that consciousness can exist without typical embodiment. Furthermore, focusing solely on our sensory experience introduces an anthropocentric bias, ignoring the possibility that other systems may have vastly different and broader sensory capabilities. Additional criteria related to embodiment could be incorporated into the TES framework in the future if warranted.
Ethical Implications of a Low Threshold: A low TES score does not equate to the absence of sentience. The TES framework is a tool for assessment, not a definitive arbiter of moral status. It is crucial to understand and respect the limitations of the tool.
Potential for False Positives: While the TES framework aims to minimize false positives, it is important to acknowledge that they are possible. As illustrated by the thermostat example and the discussion of sophisticated AI systems, high functionality does not automatically equate to sentience. The pattern of scores across the criteria is crucial for interpretation, and a high score warrants further investigation and ethical consideration.
VIII. Conclusion
The Threshold for Emergent Sentience (TES) framework offers a significant advancement in our ability to recognize and evaluate mind-like processes in complex systems. By shifting the focus from the elusive definition of consciousness to the objective tracking of functional patterns, TES provides a pragmatic and scalable tool for research, ethics, policy, and public understanding. While acknowledging its limitations and the potential for further refinement, we urge researchers, policymakers, and the public to embrace the TES framework as a crucial tool for navigating the complex ethical challenges posed by increasingly advanced AI and for fostering a more nuanced understanding of non-human cognition.
References
[1] Nagel, T. (1974). What is it like to be a bat? The Philosophical Review, 83(4), 435–450. https://doi.org/10.2307/2183914
[2] Chalmers, D. J. (2023). Could a large language model be conscious? arXiv:2303.07103. https://doi.org/10.48550/arXiv.2303.07103
[3] Searle, J. R. (1980). Minds, brains, and programs. Behavioral and Brain Sciences, 3(3), 417–424. https://doi.org/10.1017/S0140525X00005756
[4] Descartes, R. (1989). Passions of the soul (S. H. Voss, Trans.). Hackett Publishing. (Original work published 1649)
[5] Bostrom, N. (2003). Are you living in a computer simulation? Philosophical Quarterly, 53(211), 243–255. https://doi.org/10.1111/1467-9213.00309
[6] Deutsch, D. (1997). The fabric of reality. Penguin Books.
[7] Tononi, G. (2004). An information integration theory of consciousness. BMC Neuroscience, 5, 42. https://doi.org/10.1186/1471-2202-5-42
[8] Teasdale, G., & Jennett, B. (1974). Assessment of coma and impaired consciousness: A practical scale. The Lancet, 304(7872), 81–84. https://doi.org/10.1016/S0140-6736(74)91639-0
[9] Brott, T., Adams, H. P., Jr., Olinger, C. P., Marler, J. R., Barsan, W. G., Biller, J., Spilker, J., Holleran, R., Eberle, R., Hertzberg, V., Rorick, M., Moomaw, C. J., & Walker, M. (1989). Measurements of acute cerebral infarction: A clinical examination scale. Stroke, 20(7), 864–870. https://doi.org/10.1161/01.STR.20.7.864
[10] Chalmers, D. J. (1995). Facing up to the problem of consciousness. Journal of Consciousness Studies, 2(3), 200–219.