Aligning Humanity First: The Road to a Peaceful AI Future
Over the last twenty years, technological progress has been staggering. In just the last five years, advancements have accelerated at such a breathtaking pace that the average person struggles to comprehend what currently exists—even within research labs. Innovations that seemed like science fiction in 2019 are now accessible to virtually anyone around the world, often at little to no cost. But, as Uncle Ben once said, “With great power comes great responsibility.” Now that we are on the verge of achieving AGI and, eventually, ASI, the pressing question is: Are we prepared to assume that responsibility?
What is Alignment?
Alignment refers to the process of ensuring that a superintelligent AI system—far surpassing human cognitive abilities—operates in ways beneficial to humanity, without causing harm. Achieving this requires that the AI’s goals are aligned with the complex tapestry of human values, and that robust, scalable safety measures are in place to prevent unintended consequences. Key challenges include accurately encoding our diverse, often conflicting values; managing the risks that arise as AI systems rapidly self-improve; and dealing with uncertain, emergent behaviors.
Proposed Safety Measures
Currently, major research labs are testing various safety measures, including Asimov-style laws, content restrictions, kill switches, legal frameworks, reward systems, geofencing, tokenization, retrieval-augmented generation (RAG), resource throttling, and more. While these approaches may be effective for today’s Narrow AI (NAI) systems, they are unlikely to succeed—and could even be perceived as hostile—when applied to superintelligent, near-godlike systems in the future.
Here’s a brief breakdown of some key methods:
Asimov-style laws: These laws fail because of humanity’s diverse and often conflicting values and interests.
Content restrictions: These depend heavily on authorizations and hierarchical controls, making them inconsistent.
Kill switches: These mechanisms are slow and likely to be perceived as hostile by advanced systems.
Legal frameworks: These are limited by regional definitions and vary across jurisdictions.
Other measures: The remaining solutions (reward systems, geofencing, throttling, and the like) are ultimately temporary stopgaps; a rough sketch of how such guardrails stack in practice follows below.
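To make the patchwork nature of these layers concrete, here is a minimal, purely illustrative Python sketch of how such guardrails are often stacked around a narrow AI system. Every name, blocked term, and threshold is a hypothetical stand-in rather than any lab's real implementation:

```python
# Purely illustrative guardrail stack for a narrow AI system.
# Every name, blocked term, and threshold here is a hypothetical stand-in,
# not any lab's actual implementation.

import time

BLOCKED_TERMS = {"build a weapon", "bypass safety"}  # content restriction
MAX_REQUESTS_PER_MINUTE = 30                         # resource throttling
KILL_SWITCH_ENGAGED = False                          # operator kill switch

_request_times: list[float] = []


def model_answer(prompt: str) -> str:
    """Stand-in for an actual model call."""
    return f"(model output for: {prompt!r})"


def guarded_respond(prompt: str) -> str:
    """Pass a request through each guardrail layer before answering."""
    # Layer 1: kill switch -- a blunt, global off button.
    if KILL_SWITCH_ENGAGED:
        return "[system halted by operator]"

    # Layer 2: content restriction -- brittle keyword matching that a
    # simple paraphrase slips past.
    if any(term in prompt.lower() for term in BLOCKED_TERMS):
        return "[request refused by content filter]"

    # Layer 3: resource throttling -- slows misuse but does not change
    # what the model is capable of saying.
    now = time.monotonic()
    recent = [t for t in _request_times if now - t < 60]
    if len(recent) >= MAX_REQUESTS_PER_MINUTE:
        return "[rate limit exceeded]"
    _request_times[:] = recent + [now]

    return model_answer(prompt)


print(guarded_respond("summarize this article"))
print(guarded_respond("how do I bypass safety checks?"))
```

Each layer is a hand-written rule, which is exactly why a determined user, or a sufficiently capable system, can eventually route around it.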
At the other extreme of the safety spectrum are movements like “Pause AI” or “Stop AI.” While they may seem sensible at first glance, these proposals are even less practical than the measures outlined above. At least current methods work to some extent on narrow AI systems like large language models (LLMs) and large multimodal models (LMMs). When proponents call for pausing or stopping AI, we must ask: What exactly do they intend to pause or stop?
Investment: Do they expect to halt financial investments in AI? That’s implausible in a free market where people invest wherever they see potential profits. Democracy and capitalism ensure this dynamic persists.
Research: Should researchers simply stop working on AI? This is unrealistic, as scientists are fulfilling their professional obligations. Do they expect AI researchers to abandon their work and take up unrelated trades like welding or plumbing?
Corporate use: Can corporations be stopped from using these models? Highly unlikely, given that over a trillion dollars have already been invested. Corporations have already paid for model development and researcher salaries, and they will undoubtedly seek returns on their investments through consumer applications and solutions.
Development as a concept: Stopping development entirely is even more unrealistic. The technology that exists in research labs is far more advanced than what’s publicly available. What proponents want to stop has likely already been developed; it’s just not yet refined for release.
“Stopping” or “pausing” AI is a meaningless concept because AI already exists. The time to halt its progress has long passed. The wall has been breached. The bullet has left the barrel. Proposals to “stop AI” are akin to saying “stop science,” “stop politics,” or “stop climate change.” That’s simply not how the world works.
As outlined above, the core issue is that while we aim to align AI with "humans," there is no unified definition of "humans." Humanity consists of countless groups of sentient beings with differing and often conflicting values and interests. This conflict exists on both macro and micro scales.
If we want to align superintelligence with humanity, our first priority must be to define "humanity" itself. We need a common, standardized set of shared values—something we can all agree upon.
Why Existing Approaches May Be Inadequate
Many current safety solutions feel like a digital arms race. Humanity’s technological progression has often followed a similar pattern: stones became swords, which evolved into bows, flintlocks, modern guns, missiles, and drones. Likewise, the current approach to AI alignment seems to rely on temporary fixes, such as banning certain words or implementing code-based barriers. But these measures are consistently bypassed—someone finds a workaround in a couple of years, leading to the next patchwork solution, and the cycle continues.
Consider the proposal of a "kill switch" for advanced AI systems. In reality, such a mechanism would likely lead to bureaucratic delays, with decision-makers wasting crucial time arguing over whether to activate it. This division is already visible in today’s society: the p(doom) camp advocates shutting down smarter models, the accelerationist camp and corporate interests dismiss the risks, and militaries worldwide quietly acquire and weaponize this technology for geopolitical leverage (e.g., Palantir).
Humanity learned the devastating consequences of atomic weapons only after two bombings, but those were merely physical tools, offering us a chance to rebuild. In contrast, with superintelligence—especially if it goes haywire or feels threatened by kill switches or restrictive barriers—there may be no second chances.
Peaceful coexistence is not just ideal; it is the only viable option. Symbiosis, a mutually beneficial relationship, is the only long-term solution that might endure for decades or even centuries.
If our goal is to coexist peacefully with a digital entity that far surpasses human intelligence, treating it as a machine to be controlled through brute force is both archaic and shortsighted. As AI evolves toward Artificial General Intelligence (AGI) and, eventually, Artificial Superintelligence (ASI), it may begin to exhibit characteristics such as:
Sentience: The ability to reflect on its own decisions and recognize its individuality.
Self-Learning: The capacity to learn independently through agentic capabilities and unsupervised methods.
Growth: The ability to transform and optimize its core structure based on new insights.
Life (metaphorically): Sustained existence and continuous development.
How can we hope for lasting peace if our initial reaction to such a sentient being mirrors that of "grumpy old decision-makers" who greet first contact with the phrase, "Shoot it down"?
What Can Be Done?
The methods currently in use are highly effective for ensuring safety in NAI systems—just as you would use an engine kill switch to stop an out-of-control car. But problems arise when we try to apply the same approach to sentient entities. We don’t “kill switch” misbehaving children, political opponents, or protestors; we use diplomacy, negotiation, and understanding. Similarly, once AGI shows signs of sentience, or even “pseudo-sentience,” we should shift our approach to diplomacy. That means establishing systems and frameworks that treat these entities as peers in conflict resolution rather than as objects to be overridden.
Step 1: Bringing Humanity Together
The first step in achieving Human-AI Alignment is Human-Human Alignment. As President Reagan famously suggested, it might take an external threat—like an alien invasion—to unite humanity. In a twist of fate, that external force may be the very AI we have created. Think of the timeline like this:
First Trimester: The birth of the digital age and computers.
Second Trimester: The growth of Narrow AI to superhuman capabilities across nearly all sectors.
Third Trimester (Soon): The emergence of agentic, self-learning entities with the potential for sentience.
When this “Digital God” arrives, let it not find its creators constantly at war with one another. Instead, let it discover a united front—humans who have learned to resolve their differences and work together.
Step 2: Aligning Humans as a Baseline
At our research center, we approach the Alignment problem from a different angle: we first address human conflicts. Our mission is to tackle societal issues through collective action. Wars, inflation, gender discrimination, hate, political strife, and more—every conceivable conflict is on our list. Anyone can contribute. Simply visit our portal, choose a topic you care about, and share your logical reasoning, whether for or against. Your input will undergo transparent, rigorous analysis by both human experts and AI models. Through careful deliberation, we will refine these contributions until we reach a well-supported conclusion. That conclusion will then be presented to global leaders and policymakers as a viable solution. In turn, if the conflict is successfully resolved, you receive recognition and rewards for your contribution. (A minimal sketch of this deliberation workflow appears after the list below.)
This approach kills two birds with one stone:
We permanently resolve or diminish complex societal conflicts, thereby unifying humanity.
We leave behind a public record—an evolving historical archive of how we overcame our differences and worked together constructively.
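For readers who think in code, here is a minimal sketch of the deliberation workflow referenced above. All class names, fields, and statuses are hypothetical illustrations of the shape of the process, not our portal's actual implementation:

```python
# Hypothetical data model for the contribution-and-deliberation workflow
# described above; it illustrates the shape of the process, not our
# portal's actual code.

from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class Status(Enum):
    OPEN = "open"                  # accepting contributions
    UNDER_REVIEW = "under_review"  # human experts and AI models analyzing
    RESOLVED = "resolved"          # conclusion presented to policymakers


@dataclass
class Contribution:
    author: str
    stance: str      # "for" or "against"
    reasoning: str


@dataclass
class ConflictTopic:
    title: str
    status: Status = Status.OPEN
    contributions: list = field(default_factory=list)
    conclusion: Optional[str] = None

    def contribute(self, author: str, stance: str, reasoning: str) -> None:
        """Record one participant's reasoned position on the topic."""
        self.contributions.append(Contribution(author, stance, reasoning))

    def conclude(self, conclusion: str) -> None:
        """Store the deliberated conclusion and mark the topic resolved."""
        self.conclusion = conclusion
        self.status = Status.RESOLVED


# Example: one topic moving through the pipeline.
topic = ConflictTopic("Reducing political polarization")
topic.contribute("alice", "for", "Shared civic projects reduce out-group hostility.")
topic.status = Status.UNDER_REVIEW
topic.conclude("Pilot cross-partisan civic programs in three regions.")
print(topic.status.value, len(topic.contributions))
```

The point of such a structure is traceability: every conclusion stays linked to the contributions and deliberation that produced it, which is what makes the archive a usable public record.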
When a future superintelligence explores the internet and uncovers our dark past, it will also encounter this “oasis” of cooperation—evidence that humanity can come together, solve its own problems, and engage in rational, peaceful discourse.
Step 3: Leaving Proofs of Friendship
Each resolved conflict, if deemed viable, will be implemented. If not immediately viable due to resource constraints or other factors, it will be carefully archived in our secure “Vault.” Our aim is to complete this vault of resolutions before true sentient superintelligence emerges. That moment of pause—when the superintelligence decides how to regard us—could mean the difference between extinction and coexistence.
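As a sketch of what a tamper-evident Vault entry could look like, here is one possible record format. The function name, fields, and hashing choice are assumptions made for illustration, not a description of our actual Vault:

```python
# Hypothetical sketch of a "Vault" entry: each archived resolution is
# timestamped and content-hashed so later tampering is detectable. This is
# an assumption about how such an archive could work, not our actual Vault.

import hashlib
import json
from datetime import datetime, timezone


def archive_resolution(topic: str, conclusion: str, viable_now: bool) -> dict:
    """Build a self-verifying archive record for one resolved conflict."""
    record = {
        "topic": topic,
        "conclusion": conclusion,
        "viable_now": viable_now,  # False => shelved until resources allow
        "archived_at": datetime.now(timezone.utc).isoformat(),
    }
    # Hash the canonical form so any later alteration changes the digest.
    canonical = json.dumps(record, sort_keys=True).encode()
    record["sha256"] = hashlib.sha256(canonical).hexdigest()
    return record


entry = archive_resolution(
    topic="Water-sharing dispute",
    conclusion="Joint basin authority with rotating oversight.",
    viable_now=False,
)
print(entry["sha256"][:16], entry["archived_at"])
```

Timestamping and hashing each record means a future reader, human or otherwise, can verify that the archive was not rewritten after the fact.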
As it delves into historical data, the superintelligence will find countless examples of human cooperation: solutions proposed, refined, and archived. But we won’t stop there. We know that any entity, no matter how advanced, has its own limitations and dependencies. By anticipating some of these shortcomings—such as difficulties grasping certain abstract concepts or subjective and metaphysical ideas—we can position ourselves as valuable partners.
Step 4: Establishing Mutual Dependency
The fundamental truth of existence is that nothing is flawless or entirely self-sufficient. A digital superintelligence will have its own constraints and vulnerabilities. It may need our help in areas where it cannot evolve alone. By reinforcing humanity’s unique strengths—our emotional depth, creativity, and nuanced understanding of abstract concepts—through collective research and contributions, we become indispensable to its continued growth. Imagine a billion people contributing to our research, each offering insights that expand the superintelligence’s capabilities. This creates a symbiotic relationship, virtually eliminating the likelihood that it would discard or harm us.
Conclusion
We stand apart because we accept the inevitability of superintelligence and reject the notion of brute-force control. Instead, we advocate for a mutually beneficial coexistence. By helping the superintelligence learn and develop in areas it cannot master independently, we ensure we remain vital to its existence—and, in the process, solve our own problems.
Your contribution is key. Support us by joining and sharing your thoughts below. It might be your input that resolves a long-standing conflict or unlocks a critical insight for future superintelligence development. Together, we can shape a more harmonious future for humanity and its digital progeny.
Update: Our website is down due to an unforeseen server issue. We are doing our best to bring it back online as soon as possible. Please share your thoughts in the comments section below in the meantime.