June 30, 2025
I’ve long had a weird compulsion to learn about the various tools used by an industry or area of work and internalize the landscape they make up.
When I got into video editing years ago, the simple thing would have been to stick with the first tool I came across and get to work. Instead, off the top of my head, I can still recall important feature sets, differences, and pain points between
And in design
And in marketing
And in development… well that’s a bigger rabbit hole.
I had a plumbing issue recently and learned all about manual augers, drum augers, flat tape augers, power augers… even after I knew I was only going to buy a drum auger I wanted to know the landscape.
“Conway’s Law” is the observation that you can understand systems by understanding the shape of the organizations behind them.
Organizations which design systems (in the broad sense used here) are constrained to produce designs which are copies of the communication structures of these organizations.
— Melvin E. Conway, How Do Committees Invent?
I think I am drawn to exploring the tool space because it seems clear to me that you can understand work by understanding the tools used to create it.
Tools which create work are constrained to produce outputs that are copies of the structure, logic, and limitations of the tools themselves.
It seems knowledge about tools is a bus ticket of mine — I collect it and keep the distinctions just for the love of it.
June 30, 2025
What if improving AI task performance didn’t depend on hand-tuned prompting strategies, one-off optimizations, and the quirks of specific models?
No more tending a garden of “do this, not that” edge cases. No more “prompt engineering.”
Just program logic and optimization metrics.
Enter: DSPy
import openai  # assumes OPENAI_API_KEY is set in the environment

prompt = "You are a helpful assistant. Answer this math question step by step: Two dice are tossed. What is the probability that the sum equals two?"

response = openai.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": prompt}]
)
print(response.choices[0].message.content)

# Yes, a dumb case. But string-based prompting performance is brittle.
# Declarative, structured approach
import dspy
# Configure your language model
dspy.configure(lm=dspy.LM("openai/gpt-4o-mini"))
# Define behavior through signatures
math = dspy.ChainOfThought("question -> answer: float")
# Use the module - DSPy handles prompting automatically
result = math(question="Two dice are tossed. What is the probability that the sum equals two?")
# Get structured output with reasoning
print(result.reasoning) # Step-by-step explanation
print(result.answer) # 0.0277776
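If you're curious what DSPy actually sent to the model, you can peek at the last call:

dspy.inspect_history(n=1)  # prints the most recent LM call, including the generated prompt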
Okay, this example didn’t mean much to me when I first saw it. Let’s see a more real-world use case: extracting information from emails.
import re

# call_openai is assumed to be a thin wrapper around the chat completions call shown earlier
def process_email_old_way(subject, body, sender):
    # Separate prompts for each task - brittle and hard to maintain

    # Email classification
    classify_prompt = f"""
Classify this email as one of: order_confirmation, support_request, meeting_invitation, newsletter, promotional, invoice, shipping_notification, other

Subject: {subject}
Body: {body}
Sender: {sender}

Classification:"""
    classification = call_openai(classify_prompt)

    # Entity extraction - different prompt structure
    extract_prompt = f"""
Extract the following from this email:
- Financial amounts (format: $X.XX)
- Important dates (format: MM/DD/YYYY)
- Contact information
- Action items

Email: {subject} {body}

Extracted info:"""
    entities = call_openai(extract_prompt)

    # Urgency detection - yet another prompt
    urgency_prompt = f"""
Rate the urgency of this email from 1-4:
1=low, 2=medium, 3=high, 4=critical

Consider: {subject}

Urgency level:"""
    urgency = call_openai(urgency_prompt)

    # Manual parsing hell
    try:
        # Hope the LLM returned exactly what we expected...
        classification = classification.strip().lower()
        urgency_num = int(urgency.strip())

        # Parse entities with regex and prayer
        amounts = re.findall(r'\$[\d,]+\.?\d*', entities)
        dates = re.findall(r'\d{1,2}/\d{1,2}/\d{4}', entities)

        return {
            'type': classification,
            'urgency': urgency_num,
            'amounts': amounts,
            'dates': dates
        }
    except:
        # When it inevitably breaks...
        return {'error': 'Parsing failed'}
# Problems:
# - 4 separate API calls (slow, expensive)
# - Fragile string parsing
# - No consistency between outputs
# - Breaks when switching models
# - Manual prompt engineering for each task
# - No systematic way to improve accuracy
Want to optimize?
# When accuracy is poor, you manually add examples:
classify_prompt = f"""
Examples:
"Server down" -> support_request, critical
"Order confirmed" -> order_confirmation, low
"Meeting tomorrow" -> meeting_invitation, medium
Now classify: {subject}
"""
# Still brittle, still manual...
import dspy

# ClassifyEmail, ExtractEntities, GenerateActionItems, and SummarizeEmail are
# dspy.Signature classes (defined elsewhere) declaring each step's input/output fields.

class EmailProcessor(dspy.Module):
    def __init__(self):
        super().__init__()
        # Define WHAT you want, not HOW to prompt for it
        self.classifier = dspy.ChainOfThought(ClassifyEmail)
        self.entity_extractor = dspy.ChainOfThought(ExtractEntities)
        self.action_generator = dspy.ChainOfThought(GenerateActionItems)
        self.summarizer = dspy.ChainOfThought(SummarizeEmail)

    def forward(self, email_subject, email_body, sender):
        # Compose modules together - DSPy handles the prompting
        classification = self.classifier(
            email_subject=email_subject,
            email_body=email_body,
            sender=sender
        )
        entities = self.entity_extractor(
            email_content=f"{email_subject}\n{email_body}",
            email_type=classification.email_type
        )

        # Get structured, typed outputs automatically
        return dspy.Prediction(
            email_type=classification.email_type,
            urgency=classification.urgency,
            financial_amount=entities.financial_amount,  # Proper float
            important_dates=entities.important_dates,    # Proper list
            action_required=(classification.urgency == "critical")
        )
# Usage - clean and simple
processor = EmailProcessor()
result = processor(
"URGENT: Server Down",
"Production is offline, need immediate help",
"[email protected]"
)
print(result.email_type) # EmailType.SUPPORT_REQUEST
print(result.urgency) # UrgencyLevel.CRITICAL
print(result.financial_amount) # None (properly typed)
Want to optimize?
# Load your email dataset
emails = load_historical_emails() # 1000 labeled emails
# Define success metric
def email_accuracy(example, prediction, trace=None):
    return (example.email_type == prediction.email_type and
            example.urgency == prediction.urgency)
# Optimize the ENTIRE pipeline automatically
optimizer = dspy.MIPROv2(metric=email_accuracy)
optimized_processor = optimizer.compile(processor, trainset=emails)
# Optimized prompts for each module
# Handles edge cases automatically
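From there you can measure the lift and keep the result. A minimal sketch, assuming you hold out part of the labeled emails as a dev split:

evaluator = dspy.Evaluate(devset=emails[800:], metric=email_accuracy, display_progress=True)
evaluator(processor)             # baseline accuracy
evaluator(optimized_processor)   # accuracy after MIPROv2
optimized_processor.save("email_processor.json")  # reload later with .load()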
You should probably be using DSPy.
Signatures specify the input/output behavior of a DSPy module. Any valid variable names work; the DSPy compiler optimizes around the keywords you choose.
For example, for summarization, “document -> summary”, “text -> gist”, or “long_context -> tldr” all invoke summarization.
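A quick sketch, using the same LM configured above:

# Three equivalent ways to ask for a summary; only the field names differ
summarize = dspy.ChainOfThought("document -> summary")
gist = dspy.ChainOfThought("text -> gist")
tldr = dspy.ChainOfThought("long_context -> tldr")

print(summarize(document="DSPy decouples program logic from prompt strings.").summary)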
Modules are building blocks that handle signatures and prompt configuration and can be composed into bigger modules.
These are taken directly from https://dspy.ai/learn/programming/modules/
math = dspy.ChainOfThought("question -> answer: float")
math(question="Two dice are tossed. What is the probability that the sum equals two?")
# Prediction(
# reasoning='When two dice are tossed, each die has 6 faces, resulting in a total of 6 x 6 = 36 possible outcomes. The sum of the numbers on the two dice equals two only when both dice show a 1. This is just one specific outcome: (1, 1). Therefore, there is only 1 favorable outcome. The probability of the sum being two is the number of favorable outcomes divided by the total number of possible outcomes, which is 1/36.',
# answer=0.0277776
# )
def search(query: str) -> list[str]:
    """Retrieves abstracts from Wikipedia."""
    results = dspy.ColBERTv2(url='http://20.102.90.50:2017/wiki17_abstracts')(query, k=3)
    return [x['text'] for x in results]
rag = dspy.ChainOfThought('context, question -> response')
question = "What's the name of the castle that David Gregory inherited?"
rag(context=search(question), question=question)
# Prediction(
# reasoning='The context provides information about David Gregory, a Scottish physician and inventor. It specifically mentions that he inherited Kinnairdy Castle in 1664. This detail directly answers the question about the name of the castle that David Gregory inherited.',
# response='Kinnairdy Castle'
# )
ColBERT is a fast and accurate retrieval model, enabling scalable BERT-based search over large text collections in tens of milliseconds.
from typing import Literal

class Classify(dspy.Signature):
    """Classify sentiment of a given sentence."""

    sentence: str = dspy.InputField()
    sentiment: Literal['positive', 'negative', 'neutral'] = dspy.OutputField()
    confidence: float = dspy.OutputField()

classify = dspy.Predict(Classify)
classify(sentence="This book was super fun to read, though not the last chapter.")
# Prediction(
# sentiment='positive',
# confidence=0.75
# )
text = "Apple Inc. announced its latest iPhone 14 today. The CEO, Tim Cook, highlighted its new features in a press release."
module = dspy.Predict("text -> title, headings: list[str], entities_and_metadata: list[dict[str, str]]")
response = module(text=text)
print(response.title)
print(response.headings)
print(response.entities_and_metadata)
# Apple Unveils iPhone 14
# ['Introduction', 'Key Features', "CEO's Statement"]
# [{'entity': 'Apple Inc.', 'type': 'Organization'}, {'entity': 'iPhone 14', 'type': 'Product'}, {'entity': 'Tim Cook', 'type': 'Person'}]
def evaluate_math(expression: str) -> float:
    return dspy.PythonInterpreter({}).execute(expression)

def search_wikipedia(query: str) -> str:
    results = dspy.ColBERTv2(url='http://20.102.90.50:2017/wiki17_abstracts')(query, k=3)
    return [x['text'] for x in results]
react = dspy.ReAct("question -> answer: float", tools=[evaluate_math, search_wikipedia])
pred = react(question="What is 9362158 divided by the year of birth of David Gregory of Kinnairdy castle?")
print(pred.answer)
# 5761.328
Check out my AI tools & resources reference
June 29, 2025
Late one night, a friend mentioned something that would consume me for the next day: survey marks.
Survey markers, also called survey marks, survey monuments, or geodetic marks, are objects placed to mark key survey points on the Earth’s surface. They are used in geodetic and land surveying. A benchmark is a type of survey marker that indicates elevation (vertical position). Horizontal position markers used for triangulation are also known as triangulation stations. Benchmarking is the hobby of “hunting” for these marks.
Since 1807, NOAA’s National Geodetic Survey (NGS) and its predecessor agencies have placed permanent survey marks or monuments throughout the United States so we can know exact locations and elevations on the surface of the Earth. A typical mark is a brass, bronze, or aluminum disk (or rod), but marks might also be prominent objects like water towers or church spires. The National Geodetic Survey’s database contains information on over 1.5 million survey disks, each with a detailed datasheet describing its exact position and physical characteristics.
The National Geodetic Survey Map is an ArcGIS Online Web Map Application that enables users to view multiple datasets provided by the National Geodetic Survey.
The Mark Recovery Dashboard displays mark recoveries that have been submitted to NGS.
An app about hunting these marks and tracking progress in a region would give my friends and me a reason to explore places — some nerds need nerdy nudges to navigate nature.
Geocaching.com seems to have had this feature at some point but has since removed the dataset (or maybe they just removed the mark page).
Benchmark Hunter is an iOS app, released in 2021, for hunting NGS survey marks.
This seemed like the perfect excuse to answer a bigger question: In June 2025, what does AI-assisted development look like for a solo developer building something real?
As a nod to Pokemon Go, I made a new directory, geodetic-go.
Information about survey monuments (aka “marks”) stored in the National Geodetic Survey’s Integrated Database (NGS IDB) may be retrieved and displayed in a variety of methods. One standard is known as a datasheet, an ASCII text file consisting of rigorously formatted lines of 80 columns.
The NGS provides datasheets at the state-level.
I chucked the DSDATA format spec into Google Gemini Chat and asked it to write a parser focused on extracting latitude, longitude, and marker type. First I had it write TypeScript — as the rest of the codebase would be — but it kept producing non-working stuff. Then I asked again without specifying a language and it started writing Python, but I didn’t want to deal with the venv stuff. So I told it to write it in Go and it worked on the first try.
type Datasheet struct {
    PID               string `parquet:"pid"`
    Designation       string `parquet:"designation"`
    State             string `parquet:"state"`
    County            string `parquet:"county"`
    Latitude          string `parquet:"latitude"`
    Longitude         string `parquet:"longitude"`
    OrthometricHeight string `parquet:"orthometric_height"`
    EllipsoidHeight   string `parquet:"ellipsoid_height"`
    MarkerType        string `parquet:"marker_type"`
    RawText           string `parquet:"raw_text"`
}
I knew I wanted to parse the data and store it in Cloudflare R2 because I enjoy the product and the pricing. My first idea was SQLite, but once I realized the raw text I want to display would total gigabytes — and I didn’t want to send gigabytes down to a client — I changed my approach. Given the importance of compression and the write-once, read-many nature of the data, I chose Apache Parquet files.
NGS provides DataSheets at the State level. However, to minimize data requirements I partitioned by county as well.
The pipeline is:
- download the state-level DataSheet files (datasheet-downloader)
- parse them into Datasheet records and write county-partitioned Parquet files (datasheet-parser)
- upload the Parquet files to R2
To upload into R2, I prompted Claude Code to write a script to use Rclone.
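The real parser is Go, but the partition-by-county idea is easy to sketch. Roughly, in Python with pyarrow (hypothetical sample rows, columns mirroring the Datasheet struct above):

import pyarrow as pa
import pyarrow.dataset as ds

# A couple of parsed rows, keyed the same way as the Go struct's parquet tags
table = pa.table({
    "pid": ["AB1234", "AB1235"],
    "state": ["CA", "CA"],
    "county": ["ALAMEDA", "MARIN"],
    "latitude": ["37 46 29.9", "38 04 12.1"],
    "longitude": ["122 25 09.9", "122 48 30.2"],
    "marker_type": ["DB", "DB"],
})

# One Parquet directory per (state, county) so the client only fetches the slice it needs from R2
ds.write_dataset(
    table,
    base_dir="out/datasheets",
    format="parquet",
    partitioning=["state", "county"],
    existing_data_behavior="overwrite_or_ignore",
)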
I knew I wanted to use React Router v7 SPA mode — I’ve used Remix for years and have used React Router v7 in various other projects. Vite comes with Tailwind support, but I just had Claude Code write a STYLE.md file, and it starts like this:
Terminal Design Features
Visual Style:
- Classic green-on-black terminal color scheme
- JetBrains Mono monospace font throughout
- Terminal-style borders and panels
- Animated blinking cursor effect
- CRT-style scan line animation
- Subtle screen grain effect
[…]
I didn’t start with this, though; early on I told Claude Code to redesign the frontend in this style and it did a great job.
AI models are great at writing terrible React code. Terrible, terrible React code. I imagine this will get better over time — and there are certainly prompting improvements I could make — but wow is it annoying.
I knew I wanted to use Cloudflare Workers for the backend if I could — there are a lot of limitations if you choose the Workers runtime, but when it works it’s great and the platform is great. I chose Hono as the web framework and copied the Hono Stacks markdown documentation for Claude Code to use.
Hono’s RPC feature allows you to share API specs with little change to your code. The client generated by hc will read the spec and access the endpoint type-safely.
I love this feature but interestingly enough Claude Code wrote 100% of the backend and client API code in this project. Not without needing adjustments though.
I have yet to use Claude Code in an unchained manner — I either tell it exactly what to do or I tell it to think about how to do something, review that, and then tell it to do it and approve/deny every step of the way. With AI tools, you can adjust the input and in some cases you can poke the black-box a bit to change the output — but at the end of the day the output is still non-deterministic.
From what I’ve seen, if you do not technically understand the output you will immediately shoot yourself in the foot. If you cannot tell that the output is bad, you cannot adjust it, and you just dig a deeper hole in which shit is flung. Even in this project, by choosing React, deciding to move fast, and not having proper guides set up beforehand, I let several useEffects and useStates of genuinely bad code slide! AI will produce many egregious suggestions, but if you are knowledgeable you can catch and fix them.
I used Repomix to pack the documentation for Hono.js and React Router into their own markdown files so I could tell Claude Code to search the file on how to use a certain thing. I also copy-paste specific documentation from Cloudflare into markdown files — their site has a copy-as-markdown button and it works beautifully — and tell Claude Code to read that file.
80% of the time I went “Read [x] [y] [z] and think about how to implement [a]”. There are certainly better ways of going about it, but this works pretty well.
We recommend using the word “think” to trigger extended thinking mode, which gives Claude additional computation time to evaluate alternatives more thoroughly. These specific phrases are mapped directly to increasing levels of thinking budget in the system: “think” < “think hard” < “think harder” < “ultrathink.” Each level allocates progressively more thinking budget for Claude to use.
AI amplifies specific development practices. Good practices become superpowers; bad practices become disasters. The two best practices you can adopt right now are
packages
- backend - Hono/Cloudflare Workers API
- datasheet-downloader - Go downloader for NGS DataSheets
- datasheet-parser - Go parser for NGS DataSheets → Parquet files
- frontend - React Router web application

Check out my AI tools & resources reference