Christopher Philip Hebert

2025-06-20

Yet Another Programming Language

What can we add to the programming language space?

Ugh. I don't want to fuss over keyword selection. I don't want to code for the coder. I don't want to just go through the motions of building a language for the sake of it. (I've done that.)

I want to make something significantly different in hopes that it results in something significantly better.

In as few words as possible, here are the core ideas:

Canonically store a program as rows in a relational database, typically SQLite. Not text.
The IDE is a web app, served locally or remotely.
The programmer is encouraged to select their compilation target, which informs analytics that percolate up into the IDE to guide development.

These ideas interact with each other in a few ways, but they are individually interesting. Let's start with the first.

Benefits of SQLite Canonical Representation

Highlights:

table constraints prevent representation of invalid programs
diffs are structured, not textual
refactorings (changes in some aspect of how the IDE presents the program to the programmer) are trivially provably safe
transactions permit atomic changes
easy to query code structure, dependencies, etc.
multiple representations, one truth

In more detail:

Table Constraints Prevent Invalid Programs

Traditional text-based languages rely on parsers to catch syntax errors. With SQLite as the representation, the database schema itself enforces correctness:

CREATE TABLE expressions (
    id INTEGER PRIMARY KEY,
    type TEXT CHECK(type IN ('binary', 'unary', 'literal', 'identifier')),
    parent_id INTEGER REFERENCES expressions(id)
    -- A binary expression must have exactly 2 children,
    -- enforced through triggers or application logic
);

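With foreign key enforcement turned on (SQLite leaves it off by default) and triggers covering rules like child counts, it becomes impossible to store malformed AST nodes. The database won't let you.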
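To make that concrete, here is a minimal sketch using Python's built-in sqlite3 module. The trigger name and the "at most 2 children" rule are my assumptions for illustration. A trigger can cap the child count on insert, but "exactly 2" would need a separate completeness check.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves foreign keys off by default
conn.executescript("""
CREATE TABLE expressions (
    id INTEGER PRIMARY KEY,
    type TEXT NOT NULL CHECK (type IN ('binary', 'unary', 'literal', 'identifier')),
    parent_id INTEGER REFERENCES expressions(id)
);

-- Hypothetical trigger: a binary expression may hold at most 2 children.
CREATE TRIGGER binary_max_two_children
BEFORE INSERT ON expressions
WHEN (SELECT type FROM expressions WHERE id = NEW.parent_id) = 'binary'
 AND (SELECT COUNT(*) FROM expressions WHERE parent_id = NEW.parent_id) >= 2
BEGIN
    SELECT RAISE(ABORT, 'binary expression already has 2 children');
END;
""")


def try_insert(node_id, node_type, parent_id=None):
    """Return None if the row was stored, or the database's error message."""
    try:
        conn.execute(
            "INSERT INTO expressions (id, type, parent_id) VALUES (?, ?, ?)",
            (node_id, node_type, parent_id),
        )
        return None
    except sqlite3.Error as exc:
        return str(exc)


results = {
    "root": try_insert(1, "binary"),
    "left": try_insert(2, "literal", 1),
    "right": try_insert(3, "identifier", 1),
    "bad_type": try_insert(4, "banana"),          # rejected by the CHECK constraint
    "orphan": try_insert(5, "literal", 12345),    # rejected by the foreign key
    "third_child": try_insert(6, "literal", 1),   # rejected by the trigger
}
stored = conn.execute("SELECT COUNT(*) FROM expressions").fetchone()[0]

for label, error in results.items():
    print(label, "->", "stored" if error is None else f"rejected: {error}")
```

Only the three well-formed rows are stored. The other three attempts never reach the table.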

Structured Diffs, Not Text Diffs

Version control becomes more meaningful. Instead of:

- function calculateTotal(items) {
+ function calculateTotal(items, taxRate) {

You get:

-- Function signature change
UPDATE function_parameters
SET parameter_count = 2,
parameter_2_name = 'taxRate',
parameter_2_type = 'number'
WHERE function_id = 42;

The exact nature of the change is explicit and queryable.
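As a rough sketch of how such a diff could be computed, the snippet below compares two snapshots of a parameters table row by row. The one-row-per-parameter schema, the `snapshot` and `structured_diff` helpers, and the `array` type name are all assumptions of mine, not a settled design.

```python
import sqlite3

# Hypothetical normalized schema: one row per parameter.
SCHEMA = """
CREATE TABLE function_parameters (
    function_id INTEGER NOT NULL,
    position    INTEGER NOT NULL,
    name        TEXT NOT NULL,
    type        TEXT NOT NULL,
    PRIMARY KEY (function_id, position)
);
"""


def snapshot(rows):
    """Build an in-memory database holding one version of the program."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO function_parameters VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return conn


def structured_diff(old, new):
    """Compare two snapshots row by row, keyed on (function_id, position)."""
    query = "SELECT function_id, position, name, type FROM function_parameters"
    old_rows = {(f, p): (n, t) for f, p, n, t in old.execute(query)}
    new_rows = {(f, p): (n, t) for f, p, n, t in new.execute(query)}
    changes = []
    for key in sorted(old_rows.keys() | new_rows.keys()):
        before, after = old_rows.get(key), new_rows.get(key)
        if before is None:
            changes.append(("added", key, after))
        elif after is None:
            changes.append(("removed", key, before))
        elif before != after:
            changes.append(("changed", key, (before, after)))
    return changes


old = snapshot([(42, 1, "items", "array")])
new = snapshot([(42, 1, "items", "array"), (42, 2, "taxRate", "number")])
diff = structured_diff(old, new)
print(diff)  # [('added', (42, 2), ('taxRate', 'number'))]
```

The diff says "function 42 gained a second parameter named taxRate", which is something a tool can act on without re-parsing anything.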

Provably Safe Refactoring

Moving code around becomes a matter of updating foreign keys. The referential integrity of the database guarantees that no reference is left dangling:

-- Extract method: move code block to new function
BEGIN TRANSACTION;
INSERT INTO functions (name, body_ast_id)
VALUES ('extractedMethod', :ast_id);
UPDATE expressions
SET parent_function_id = last_insert_rowid()
WHERE id = :ast_id;
-- All references automatically maintain integrity
COMMIT;

If the refactoring would break a reference, the constraint violation makes the offending statement fail, and the IDE rolls back the whole transaction.
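Here is a small sketch of that failure path in Python's sqlite3 module. The two-table schema and the `extract_method` helper are hypothetical, and the `target_function_id` override exists only to simulate a buggy refactoring that points at a function that does not exist.

```python
import sqlite3

# isolation_level=None: transactions are managed explicitly with BEGIN/COMMIT/ROLLBACK
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE functions (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE expressions (
    id INTEGER PRIMARY KEY,
    parent_function_id INTEGER NOT NULL REFERENCES functions(id)
);
INSERT INTO functions (id, name) VALUES (1, 'calculateTotal');
INSERT INTO expressions (id, parent_function_id) VALUES (10, 1), (11, 1);
""")


def extract_method(expr_id, new_name, target_function_id=None):
    """Move an expression into a newly created function, all or nothing."""
    conn.execute("BEGIN")
    try:
        cur = conn.execute("INSERT INTO functions (name) VALUES (?)", (new_name,))
        target = cur.lastrowid if target_function_id is None else target_function_id
        conn.execute(
            "UPDATE expressions SET parent_function_id = ? WHERE id = ?",
            (target, expr_id),
        )
        conn.execute("COMMIT")
        return True
    except sqlite3.IntegrityError:
        # The failed statement is undone by SQLite; ROLLBACK undoes the rest.
        conn.execute("ROLLBACK")
        return False


ok = extract_method(10, "extractedMethod")
broken = extract_method(11, "anotherMethod", target_function_id=999)

names = [n for (n,) in conn.execute("SELECT name FROM functions ORDER BY id")]
parent_of_11 = conn.execute(
    "SELECT parent_function_id FROM expressions WHERE id = 11"
).fetchone()[0]
print(ok, broken, names, parent_of_11)
```

The second call inserts a function row and then trips the foreign key. After the rollback, neither the new function nor the moved expression is visible.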

Atomic Changes Through Transactions

Complex multi-file refactorings that traditionally require careful coordination become single atomic operations:

BEGIN TRANSACTION;
-- Rename a heavily-used function
UPDATE functions SET name = 'newName' WHERE id = :id;
-- The database ensures all call sites remain valid
-- through foreign key relationships
COMMIT; -- Either everything updates or nothing does
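A sketch of why this works, assuming call sites reference functions by id rather than by name. The table layout and the `rename_many` helper are my own illustration. Python's `with conn:` block commits on success and rolls back if anything inside raises.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE functions (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE function_calls (
    id INTEGER PRIMARY KEY,
    caller_id INTEGER NOT NULL REFERENCES functions(id),
    callee_id INTEGER NOT NULL REFERENCES functions(id)
);
INSERT INTO functions (id, name) VALUES (1, 'main'), (2, 'report'), (3, 'oldName');
INSERT INTO function_calls (caller_id, callee_id) VALUES (1, 3), (2, 3);
""")


def call_sites():
    """Resolve every call site to (caller name, callee name)."""
    return conn.execute("""
        SELECT caller.name, callee.name
        FROM function_calls fc
        JOIN functions caller ON caller.id = fc.caller_id
        JOIN functions callee ON callee.id = fc.callee_id
        ORDER BY fc.id
    """).fetchall()


def rename_many(new_names):
    """Apply several renames as one transaction: all of them or none."""
    try:
        with conn:  # commits on success, rolls back if the block raises
            for function_id, new_name in new_names.items():
                conn.execute(
                    "UPDATE functions SET name = ? WHERE id = ?",
                    (new_name, function_id),
                )
        return True
    except sqlite3.IntegrityError:
        return False


renamed = rename_many({3: "newName"})
# The second rename collides with an existing name, so the first is undone too.
clashed = rename_many({1: "entry", 2: "newName"})

sites = call_sites()
names = [n for (n,) in conn.execute("SELECT name FROM functions ORDER BY id")]
print(renamed, clashed, sites, names)
```

The rename touches exactly one row, and every call site picks up the new name through the join. The failed batch leaves `main` untouched even though its update ran first.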

Powerful Code Queries

Questions that require complex static analysis tools become simple SQL:

-- Find all recursive functions (directly or mutually recursive)
WITH RECURSIVE call_graph AS (
    SELECT caller_id, callee_id FROM function_calls
    UNION  -- not UNION ALL: deduplication lets the recursion terminate on cycles
    SELECT fc.caller_id, cg.callee_id
    FROM function_calls fc
    JOIN call_graph cg ON fc.callee_id = cg.caller_id
)
SELECT DISTINCT name FROM functions
WHERE id IN (
    SELECT caller_id FROM call_graph
    WHERE caller_id = callee_id
);

-- Find unused variables
SELECT v.name FROM variables v
LEFT JOIN variable_references vr ON v.id = vr.variable_id
WHERE vr.id IS NULL;

-- Analyze coupling between modules
SELECT m1.name AS from_module, m2.name AS to_module, COUNT(*) AS dependency_count
FROM module_dependencies md
JOIN modules m1 ON md.from_module = m1.id
JOIN modules m2 ON md.to_module = m2.id
GROUP BY m1.id, m2.id
ORDER BY dependency_count DESC;

The entire codebase becomes a queryable knowledge graph.
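To check that the recursion query behaves on a call graph that actually contains cycles, here is a small runnable sketch. The sample functions are made up. It uses UNION rather than UNION ALL in the recursive CTE so that already-seen pairs are dropped and the query terminates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE functions (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE function_calls (caller_id INTEGER NOT NULL, callee_id INTEGER NOT NULL);

-- Made-up sample program: fibonacci calls itself, isEven/isOdd call each other.
INSERT INTO functions VALUES
    (1, 'main'), (2, 'fibonacci'), (3, 'isEven'), (4, 'isOdd'), (5, 'helper');
INSERT INTO function_calls VALUES
    (1, 2), (2, 2), (1, 3), (3, 4), (4, 3), (1, 5);
""")

RECURSIVE_FUNCTIONS = """
WITH RECURSIVE call_graph(caller_id, callee_id) AS (
    SELECT caller_id, callee_id FROM function_calls
    UNION  -- set semantics: each (caller, callee) pair is expanded only once
    SELECT fc.caller_id, cg.callee_id
    FROM function_calls fc
    JOIN call_graph cg ON fc.callee_id = cg.caller_id
)
SELECT name FROM functions
WHERE id IN (SELECT caller_id FROM call_graph WHERE caller_id = callee_id)
ORDER BY name
"""

recursive = [name for (name,) in conn.execute(RECURSIVE_FUNCTIONS)]
print(recursive)  # ['fibonacci', 'isEven', 'isOdd']
```

Direct recursion and mutual recursion both show up, while `main` and `helper` do not, since nothing in the graph leads back to them.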

Multiple Representations, One Truth

Perhaps the most revolutionary aspect: when your program is stored as structured data rather than text, any number of IDEs can present it differently. The SQLite database is the single source of truth, but the projections can be radically different.

Imagine connecting to the same .db file with different editors:

Traditional Text Editor View:

function fibonacci(n) {
if (n <= 1) return n;
return fibonacci(n - 1) + fibonacci(n - 2);
}

Visual Block Editor View:

Draggable function blocks with connection ports
Recursive calls shown as arrows looping back
Base case highlighted in a different color

Spreadsheet View:

Functions as sheets
Parameters as columns
Call relationships as formulas

Graph View:

Nodes for each function and expression
Edges showing data flow and control flow
Real-time highlighting of execution paths

Mathematical Notation View:

fib(n) = { n,                    if n ≤ 1
         { fib(n-1) + fib(n-2),  otherwise

All of these would be live, editable views of the same underlying data. Change a parameter name in the visual editor? It updates in the text view. Drag a connection in the graph view? The mathematical notation adjusts.

This isn't just about preference - different representations excel at different tasks:

Text for rapid input and familiar editing
Visual blocks for understanding data flow
Graphs for seeing architectural patterns
Tables for batch operations across similar code
Domain-specific notations for specialized fields

The programmer could seamlessly switch between representations or even view multiple simultaneously. A change in any view immediately reflects in all others because they're all just projections of the same SQLite tables.
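A toy sketch of the idea: two renderers reading the same expression rows, one producing infix text and one producing prefix notation. The `value` and `position` columns and both renderer functions are assumptions for the sake of the example, standing in for far richer views.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE expressions (
    id INTEGER PRIMARY KEY,
    type TEXT NOT NULL CHECK (type IN ('binary', 'literal', 'identifier')),
    value TEXT NOT NULL,                 -- operator, literal text, or identifier name
    parent_id INTEGER REFERENCES expressions(id),
    position INTEGER NOT NULL DEFAULT 0  -- order among siblings
);
-- Rows for the expression (n - 1) + 2
INSERT INTO expressions VALUES
    (1, 'binary', '+', NULL, 0),
    (2, 'binary', '-', 1, 0),
    (3, 'identifier', 'n', 2, 0),
    (4, 'literal', '1', 2, 1),
    (5, 'literal', '2', 1, 1);
""")


def node(node_id):
    return conn.execute(
        "SELECT type, value FROM expressions WHERE id = ?", (node_id,)
    ).fetchone()


def child_ids(node_id):
    rows = conn.execute(
        "SELECT id FROM expressions WHERE parent_id = ? ORDER BY position", (node_id,)
    )
    return [child_id for (child_id,) in rows]


def as_infix(node_id):
    """Projection 1: familiar infix text."""
    kind, value = node(node_id)
    if kind != "binary":
        return value
    left, right = (as_infix(c) for c in child_ids(node_id))
    return f"({left} {value} {right})"


def as_prefix(node_id):
    """Projection 2: prefix (S-expression) notation."""
    kind, value = node(node_id)
    if kind != "binary":
        return value
    args = " ".join(as_prefix(c) for c in child_ids(node_id))
    return f"({value} {args})"


before = (as_infix(1), as_prefix(1))
# One edit to the underlying row...
conn.execute("UPDATE expressions SET value = 'count' WHERE id = 3")
# ...and both projections reflect it, because neither holds its own copy.
after = (as_infix(1), as_prefix(1))
print(before)  # ('((n - 1) + 2)', '(+ (- n 1) 2)')
print(after)   # ('((count - 1) + 2)', '(+ (- count 1) 2)')
```

Neither renderer owns the program. Each one is a read of the same rows, so there is nothing to keep in sync.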

This fundamentally breaks the tyranny of text-based programming. The code becomes truly semantic, with presentation entirely separated from structure.

Conclusion

That's enough to think on for now.