Marco
Serenelli.
AI Engineer
Iamones, since 2024
Building agents and retrieval systems, and evaluating them in production, where real users break things.
§01
Index.
A map of what's here. Start anywhere.
§02
Built.
Things built, things shipped.
- 01 · Identity for AI Agents · 2026
handling non-human identities
iam · agents · iamones
AI agents are becoming a new identity category in the enterprise, and the classical IAM graph wasn't designed for them. Thinking about what access review, SoD, and lifecycle management look like when the identity being governed is non-human but can reason and call tools.
- 02 · AI Agent Workflow (HITL) · 2025
minutes not days · human-in-the-loop
agents · langgraph · iamones
A LangGraph workflow for actionable IAM requests such as "assign me this role" or "revoke that access," with the agent ensuring correctness and a human-in-the-loop step ensuring safety. Checkpoints sit on the irreversible steps, not the low-confidence ones.
- 03 · LLM-SQL Generator · 2024
500+ requests served · self-serve analytics
nl2sql · iamones
Natural-language-to-SQL over the IAM warehouse, so auditors and access-review managers can ask things like "Who has access to SAP?" without needing to write SQL. The hard part was safety and guardrails: keeping the context tight enough to stay reliable but flexible enough to handle real questions.
- 04 · Stress Detection Research · 2024
95.8% accuracy · HC@AIxIA 2024
research · published
Instead of training RNNs on raw physiological time series, we encoded the signals as images (Gramian Angular Field, Markov Transition Field, Recurrence Plot) and classified them with CNNs. That sidestepped the compute cost and gradient instabilities of recurrent models. GAF won on WESAD; presented at HC@AIxIA 2024.
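For readers who haven't met the Gramian Angular Field before, here is a minimal sketch of the idea in plain Python. It assumes the summation variant (GASF) and min-max rescaling to [-1, 1]; the function name and the toy input are made up for illustration, and none of this is taken from the published pipeline.

```python
import math


def gramian_angular_summation_field(series):
    """Encode a 1-D series as a square GASF matrix (list of lists).

    Illustrative sketch only: assumes the summation variant and
    min-max rescaling; not the code used in the paper.
    """
    lo, hi = min(series), max(series)
    if hi == lo:
        raise ValueError("series needs at least two distinct values")
    # Rescale to [-1, 1] so each value can be read as a cosine.
    scaled = [(2 * x - hi - lo) / (hi - lo) for x in series]
    # Clamp against floating-point drift, then map values to angles.
    angles = [math.acos(max(-1.0, min(1.0, v))) for v in scaled]
    # Each pixel (i, j) is the cosine of the summed angles.
    return [[math.cos(a + b) for b in angles] for a in angles]


# Toy input: four samples become a 4x4 "image".
image = gramian_angular_summation_field([0.0, 1.0, 2.0, 1.0])
```

The resulting matrix is symmetric and keeps temporal order along its diagonal, which is what lets an ordinary image CNN pick up on time-series structure.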
§03
About.
The short version. Rome, LLMs, production systems.
AI engineer in Rome. Most days I'm wiring LLMs into production at Iamones, building retrieval pipelines, agents, and the eval loops that keep them honest. The interesting part isn't the model itself, but everything around it that decides whether the thing actually works for a real user on a Tuesday afternoon.
I learned that the hard way. A double Master's across Reykjavik and Camerino ended in a thesis on detecting stress from physiological data, which meant months of cleaning noisy sensor readings before any model could touch them. That's where I learned a clean experiment is worth more than a clever model, and it shapes how I think about LLM systems now: the hard part isn't the model, it's everything upstream and downstream of it.
Reykjavik was also where NLP got its hooks in. My first real project was something small and classic, classifying emails as spam or not, probably with BERT, I honestly don't remember. What I do remember is watching a classifier come alive on a real dataset and thinking, oh, this is crazy. I've been chasing that feeling ever since, through agents, RAG, and the whole orchestration layer that turns a model into something that actually does useful work.
A decent chunk of every week goes to reading. Papers that take two passes to click, GitHub repos I fall into for an hour, notes from people building similar things from different angles. The field moves fast enough that last year's intuition is usually outdated this year, so staying curious isn't optional.
Most papers don't survive contact with a real dataset, and the ones that do rarely survive in the shape the authors intended. That's the fun part. You read something on Saturday, hack a rough version by midweek, and by Friday you know which 10% of the idea actually holds up. Most of what I end up writing about on Substack is that 10%: me working through something until I figure out what I actually think about it.
I care about systems that ship, code reviews that teach, and the small decisions that nobody notices when they're right. When to let the LLM decide and when to hard-code the path, what you name the thing, where the prompt ends and the code begins. If any of that sounds like the work you're doing, the email's in the colophon.
- 01 · Claude Has Functional Emotions. That Changes Things.
What Anthropic found when they looked inside their own model
13 APR 2026
- 02 · What If We Stopped Predicting and Started Simulating?
How MiroFish is betting on emergence over equations
20 MAR 2026
- 03 · LLMs Don't Understand The World
Why predicting the next word was never going to be enough
27 FEB 2026
- 04 · Faster Might Not Mean Better
Reviewing an Anthropic study on AI-assisted coding and learning
25 FEB 2026
§05
Colophon.
Credits, contact, and the back page.
This is the end. If you're still here, you read everything, and that's rare. Thanks.