Effective Use-Cases for LLMs

Effective use-cases for LLMs - Aggressively Paraphrasing Me

Written by Mark in Blog

There’s a lot of talk about the shortcomings of LLMs. They don’t actually reason. They’re expensive, especially when running in a loop. They’re quite slow at doing things.

There’s a narrow category of use cases that LLMs excel at, one of which is "sifting through the noise". The noise is everything we have to process to get to what we really want.

Here are some use cases I haven’t heard much about elsewhere but have enjoyed as a software engineer.

Searching through Customer Conversations

A PM colleague uploaded the transcripts of every call with our top customers into an embedding database. Now their product proposals are deeply backed by evidence: we can say, for example, that 40% of our top customers have mentioned a given pain point. The PM also used it to identify a list of eager private beta customers to try out our new feature.

This is useful when the customer’s problem is abstract. Often, these issues don’t have clear solutions, or those solutions don’t have clear names. That makes filing feature requests hard, and organizing and deduping them even harder. Before LLMs, your best bet was that someone on your team had enough tenure to have seen this come up enough times, and that they remembered how to find all the links and connections. Now, it’s retrieval-augmented generation (RAG).
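To make the retrieval step concrete, here is a minimal sketch of "which customers mentioned this pain point?". Everything in it is an assumption for illustration: the transcripts are invented, and `embed` is a bag-of-words stand-in for a real embedding model. A word-count vector only matches shared words, so it would miss exactly the paraphrased, hard-to-name problems that a real embedding model is there to catch.

```python
import math
import re
from collections import Counter

# Hypothetical transcripts: (customer, text). Real ones would come from call recordings.
TRANSCRIPTS = [
    ("acme", "We keep losing track of which invoices were already exported."),
    ("globex", "Exporting invoices twice is a recurring headache for our finance team."),
    ("initech", "The dashboard loads quickly and the team likes the new charts."),
]


def embed(text: str) -> Counter:
    """Stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def customers_mentioning(query: str, threshold: float = 0.2) -> list[str]:
    """Customers with at least one transcript similar enough to the query."""
    q = embed(query)
    return sorted({c for c, text in TRANSCRIPTS if cosine(q, embed(text)) >= threshold})


matches = customers_mentioning("invoices exported twice")
share = len(matches) / len({customer for customer, _ in TRANSCRIPTS})
```

The `threshold` value is arbitrary here; with a real model it would need tuning against transcripts you have already read yourself.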

Going from endpoint alert -> log analysis

Any large system is going to be operating most of the time in failure mode.

— John Gall, via Lorin Hochstein, Netflix

When I’m on-call, one of my responsibilities is to triage failures on the API endpoints our team owns. These failures are reported as “high rate of HTTP 4XX/5XX”. Sometimes, it’s noise, like a DB connection hiccup for the pod. Other times, it’s signaling a bug, like customers can’t delete something anymore.

Triaging is tedious:

The first step is searching for the canonical log lines that mark the specific endpoint with the specific HTTP failure, filtered by time.

Once I find the request that triggered the alert, I search by request ID, to see the request from start to end. Based on the logs, and my source code, I can usually guess what went wrong.

Sometimes the stack trace is compiled JavaScript, rather than TypeScript, so the line numbers don’t line up. I have to guess based on the name of the next function call.

I double-check that I’m looking at a representative request. I quickly look at two or three more request IDs to make sure they’re all the same root cause.

For more difficult issues like DB connection timeouts, I’ll see if there’s clustering on the canonical log lines around timestamp, host machine, customer ID. Maybe it’s not specifically my route, but an infra issue.
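The mechanical parts of the steps above (filter the canonical log lines, follow one request ID, check for clustering) can be sketched roughly as follows. The `LogLine` fields, the 15-minute window, and the function names are all assumptions made for illustration; a real setup would run these as queries against a log store rather than over an in-memory list.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass(frozen=True)
class LogLine:
    # Hypothetical log record; the field names are assumptions.
    timestamp: datetime
    request_id: str
    endpoint: str
    status: Optional[int]  # only set on the canonical line for a request
    host: str
    customer_id: str
    message: str


def failing_requests(logs, endpoint, alert_time, window=timedelta(minutes=15)):
    """Request IDs that hit `endpoint` with a 4XX/5XX close to the alert time."""
    return sorted({
        line.request_id
        for line in logs
        if line.endpoint == endpoint
        and line.status is not None
        and line.status >= 400
        and abs(line.timestamp - alert_time) <= window
    })


def trace(logs, request_id):
    """Every log line for one request, ordered from start to end."""
    matching = [line for line in logs if line.request_id == request_id]
    return sorted(matching, key=lambda line: line.timestamp)


def cluster_by(logs, request_ids, field_name):
    """Count failing requests per host, customer ID, or any other field."""
    wanted = set(request_ids)
    value_per_request = {}
    for line in logs:
        if line.request_id in wanted:
            value_per_request[line.request_id] = getattr(line, field_name)
    return Counter(value_per_request.values())
```

If `cluster_by(logs, ids, "host")` puts nearly every failure on one host, that points toward an infra issue rather than the route itself. The judgment calls (is this request representative, what does the trace mean given the source code) are the part this sketch leaves out.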

All in all, there’s a lot of stuff to sift through. There’s so much judgment required, and I haven’t even found the problem, let alone thought about a solution yet.

Yet, an agent harness is almost perfect for this. Given some alert and timestamp, point me in the right direction: logs, source code, or clustering. This has cut my triaging time from 15+ minutes to 1-2 minutes per issue. You don’t even need the SotA ($$$$) models. Save your money, use a faster model.

I published this workflow as a skill for my teammates with the intention of sharing the actual human skill involved. The output names all the queries it tried, categorized into informative or non-informative, with links to dig deeper. I don’t want it to be magical, because I want my teammates to know how to think about triaging. I also want it to be a ramp to independent discovery.
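As a rough illustration of that output shape, here is one way a report could list every query with a link and an informative or non-informative label. The class names, fields, and placeholder links are hypothetical, not the actual skill.

```python
from dataclasses import dataclass, field


@dataclass
class QueryResult:
    description: str   # what the query looked for
    link: str          # where a teammate can rerun or inspect it
    informative: bool  # whether it narrowed down the cause


@dataclass
class TriageReport:
    alert: str
    queries: list[QueryResult] = field(default_factory=list)

    def render(self) -> str:
        """Plain-text report that shows the dead ends as well as the hits."""
        lines = [f"Alert: {self.alert}"]
        for title, flag in (("Informative", True), ("Non-informative", False)):
            lines.append(f"{title} queries:")
            matching = [q for q in self.queries if q.informative is flag]
            if not matching:
                lines.append("  (none)")
            for q in matching:
                lines.append(f"  - {q.description} ({q.link})")
        return "\n".join(lines)


report = TriageReport(
    alert="HTTP 5XX on DELETE /items",
    queries=[
        QueryResult("canonical log lines for the endpoint", "https://logs.example.invalid/q/1", True),
        QueryResult("clustering by host", "https://logs.example.invalid/q/2", False),
    ],
)
print(report.render())
```

Keeping the non-informative queries in the report is deliberate: they show a teammate which directions were ruled out and how.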

Shortening Content

I specifically didn’t call this summarizing because:

ChatGPT doesn’t summarise. When I asked ChatGPT to summarise this text, it instead shortened the text.

— https://ea.rna.nl/2024/05/27/when-chatgpt-summarises-it-actually-does-nothing-of-the-kind/

But despite all that, I still find incredible value in shortening texts! I’ll sometimes get recommended a podcast or video that’s over 1 hour long. Sometimes, I’m hooked within the first 5 minutes. But for technical content, my interest is often buried deep in the video, maybe 30 minutes in for recorded talks. I don’t want to spend that much time figuring out if something is interesting to me, and LLMs greatly help with that.

In my experience, if there’s enough interesting content in the shortened version, there’s plenty in the unshortened version. One video casually mentioned east-coast vs west-coast programming in the US. Without shortening, I would have stopped watching 19 seconds earlier out of disinterest.

Transcribing

Okay, summarizing is really useful to me, but how do I get it to work on videos and podcasts? I made myself a little automation that, given a link, will check:

If there are subtitles, download those

If it’s a video, download the audio for transcribing

If it’s audio, transcribe

Once I transform slow video or audio formats into text, I can summarize!
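The decision logic in that checklist might look something like the sketch below. It only plans the steps; the step names are hypothetical labels rather than calls into a real downloader or transcription library, and treating subtitles as a shortcut that skips transcription is my assumption about the intended order.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    VIDEO = "video"
    AUDIO = "audio"


@dataclass(frozen=True)
class Media:
    kind: Kind
    has_subtitles: bool


def plan(media: Media) -> list[str]:
    """Ordered steps to turn a link into text, then shorten it."""
    if media.has_subtitles:
        # Subtitles are already text, so no transcription is needed.
        steps = ["download_subtitles"]
    else:
        steps = []
        if media.kind is Kind.VIDEO:
            # Only the audio track matters for transcription.
            steps.append("download_audio")
        steps.append("transcribe_audio")
    steps.append("shorten_text")
    return steps
```

For example, `plan(Media(Kind.VIDEO, has_subtitles=False))` yields the longest path: download the audio, transcribe it, then shorten the text.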

I say all this with the caveat that maybe this is a...
