
Writing and mumblings

I don't focus on maintaining high writing quality; instead, I prioritize writing frequently. My goal is to maximize the number of words written per month, with the hope that eventually, impressions and impact will align. Some of these links will be videos, some will be tweets, and some will be blog posts. I hope you find something valuable.

Questions?

  • If you have topics you'd like me to write about, leave a comment in my discussions

Writing

Systems

Advice to Young People, The Lies I Tell Myself

I'm really not qualified to give advice. But enough people DM'd me on Twitter, so here it is, if only so I don't have to answer the same questions over and over again. After some more editing, I realised that I am actually writing this for my younger Katherine.

If you want to know who I am, check out blog/whoami or my Twitter.

Don't read this if you're seeking a nuanced perspective

These are simply the lies I tell myself to keep on living my life in good faith. I'm not saying this is the right way to do things. I'm just saying this is how I did things. I will do my best to color my advice with my own experiences, but I'm not going to pretend that the suffering and the privilege I've experienced are universal.

How to build a terrible RAG system

If you've seen any of my work, you know that the main message I have for anyone building a RAG system is to think of it primarily as a recommendation system. Today, I want to introduce the concept of inverted thinking to address how we should approach the challenge of creating an exceptional system.

What is inverted thinking?

Inversion is the practice of thinking through problems in reverse. It's the practice of “inverting” a problem - turning it upside down - to see it from a different perspective. In its most powerful form, inversion is asking how an endeavor could fail, and then being careful to avoid those pitfalls. [1]

Who am I?

In the next year, this blog will be painted with a mix of technical machine learning content and personal notes. I've spent more of my 20s thinking about my life than machine learning. I'm not good at either, but I enjoy both.

Life story

I was born in a village in China. My parents were the children of rural farmers who grew up during the Cultural Revolution. They were the first generation of their family to read and write, and also the first generation to leave the village.

With the advent of large language models (LLMs), retrieval augmented generation (RAG) has become a hot topic. However, throughout the past year of helping startups integrate LLMs into their stack, I've noticed that the pattern of taking user queries, embedding them, and directly searching a vector store is effectively demoware.

What is RAG?

Retrieval augmented generation (RAG) is a technique that uses an LLM to generate responses, but uses a search backend to augment the generation. In the past year, using text embeddings with a vector database has been the most popular approach I've seen being socialized.

Figure: simple RAG that embeds the user query and makes a single search.

So let's kick things off by examining what I like to call the 'Dumb' RAG Model—a basic setup that's more common than you'd think.
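To make that concrete, here is a minimal sketch of the embed-and-search loop, assuming `embed` and `complete` are hypothetical stand-ins for whatever embedding model and LLM client you use.

```python
# A minimal sketch of the "dumb" RAG loop: embed the query, take the
# top-k documents by cosine similarity, and stuff them into one prompt.
# `embed` and `complete` are hypothetical stand-ins for an embedding
# model and an LLM client.
import numpy as np


def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def dumb_rag(query: str, docs: list[str], embed, complete, k: int = 3) -> str:
    query_vec = embed(query)
    doc_vecs = [embed(doc) for doc in docs]  # in practice these are precomputed
    ranked = sorted(
        zip(docs, doc_vecs),
        key=lambda pair: cosine_similarity(query_vec, pair[1]),
        reverse=True,
    )
    context = "\n\n".join(doc for doc, _ in ranked[:k])
    prompt = f"Answer using only this context:\n\n{context}\n\nQuestion: {query}"
    return complete(prompt)  # one call: no query rewriting, no reranking, no evals
```

Everything that makes retrieval work in practice, such as query understanding, filtering, reranking, and evaluation, is absent from this loop, which is exactly why it demos well and then disappoints.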

Kojima's Philosophy in LLMs: From Sticks to Ropes

Hideo Kojima's unique perspective on game design, emphasizing empowerment over guidance, offers a striking parallel to the evolving world of Large Language Models (LLMs). Kojima advocates for giving players a rope, not a stick, signifying support that encourages exploration and personal growth. This concept, when applied to LLMs, raises a critical question: Are we merely using these models as tools for straightforward tasks, or are we empowering users to think critically and creatively?

Good LLM Observability is just plain observability

In this post, I aim to demystify the concept of LLM observability. I'll illustrate how everyday tools employed in system monitoring and debugging can be effectively harnessed to enhance AI agents. Using OpenTelemetry, we'll delve into creating comprehensive telemetry for intricate agent actions, spanning from question answering to autonomous decision-making.

What is OpenTelemetry?

Essentially, OpenTelemetry comprises a suite of APIs, tools, and SDKs that facilitate the creation, collection, and exportation of telemetry data (such as metrics, logs, and traces). This data is crucial for analyzing and understanding the performance and behavior of software applications.
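As a small example, here is a sketch of wrapping a question-answering step in a span with the OpenTelemetry Python SDK, exporting spans to the console. The `call_llm` function is a placeholder for a real model client, and the attribute names are only illustrative.

```python
# Minimal sketch: trace a question-answering step with OpenTelemetry,
# exporting spans to the console. `call_llm` is a placeholder for a
# real model client; the attribute names are illustrative.
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import ConsoleSpanExporter, SimpleSpanProcessor

provider = TracerProvider()
provider.add_span_processor(SimpleSpanProcessor(ConsoleSpanExporter()))
trace.set_tracer_provider(provider)

tracer = trace.get_tracer("llm-agent")


def call_llm(question: str) -> str:
    # Placeholder for an actual LLM call.
    return f"(model answer to: {question})"


def answer_question(question: str) -> str:
    with tracer.start_as_current_span("answer_question") as span:
        span.set_attribute("llm.question", question)
        answer = call_llm(question)
        span.set_attribute("llm.answer_length", len(answer))
        return answer


if __name__ == "__main__":
    answer_question("What is OpenTelemetry?")
```

The same spans, attributes, and exporters you would use for any service apply directly to agent steps, which is the sense in which LLM observability is just plain observability.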

Centaur Chess: AI as a Collaborative Partner

This is an experimental post; please leave feedback in the comments below.

This was an essay written by ChatGPT from a quick transcript of a five-minute monologue. The goal was to see if I could use ChatGPT to write a blog post. I think it did a pretty good job, but I'll let you be the judge.

It is my intention that by the end you'll understand that AI is not a threat to human intelligence, but rather a tool that can be used to augment human creativity and productivity.

Freediving under ice

Growing up, I wasn't very physically active. However, as I got older and had more time, I made a conscious effort to get in shape and improve my relationship with my body. During the Covid pandemic, I developed RSI in my thumbs from coding too much, which prevented me from participating in any sports.

Recommendations with Flight at Stitch Fix

As a data scientist at Stitch Fix, I faced the challenge of adapting recommendation code for real-time systems. With the absence of standardization and proper performance testing, tracing, and logging, building reliable systems was a struggle.

To tackle these problems, I created Flight – a framework that acts as a semantic bridge and integrates multiple systems within Stitch Fix. It provides modular operator classes for data scientists to develop, and offers three levels of user experience.
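Flight is an internal Stitch Fix framework, so its real API isn't public. Purely as an illustration of the "modular operator" idea, here is a hypothetical sketch of small operator classes that a data scientist implements and a framework composes into a pipeline; none of these names come from Flight itself.

```python
# Hypothetical sketch of a "modular operator" pattern; NOT Flight's actual
# API, just an illustration of small composable classes that a framework
# can wire together into a recommendation pipeline.
from abc import ABC, abstractmethod
from typing import Any


class Operator(ABC):
    """A single, testable step in a recommendation pipeline."""

    @abstractmethod
    def run(self, inputs: dict[str, Any]) -> dict[str, Any]:
        ...


class CandidateGenerator(Operator):
    def run(self, inputs: dict[str, Any]) -> dict[str, Any]:
        # e.g. fetch candidate items for a client from a feature store
        return {**inputs, "candidates": ["item_3", "item_1", "item_2"]}


class Ranker(Operator):
    def run(self, inputs: dict[str, Any]) -> dict[str, Any]:
        # e.g. score and order candidates with a trained model
        return {**inputs, "ranked": sorted(inputs["candidates"])}


def run_pipeline(operators: list[Operator], inputs: dict[str, Any]) -> dict[str, Any]:
    # The framework owns sequencing, so logging, tracing, and performance
    # testing can be added around every operator in one place.
    for op in operators:
        inputs = op.run(inputs)
    return inputs
```

The appeal of this shape is that each step can be unit-tested, traced, and swapped independently, which is the kind of standardization the post describes as missing before Flight.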